Retention modeling methodology for airlines

ABSTRACT

A method of building a customer retention model for the commercial passenger airline industry is described. The major contributions of this invention are as follows. Based on a thorough investigation of the background and the current deregulated, competitive environment of the airline industry, a competitive market approach to defining retention for this industry is proposed in detail. A new Customer Value Metric Model (CVMM) is proposed and described, and a variety of calculation methods are presented. These methods give airlines more accurate and balanced measures of their high-valued customers. Data elements and data sources, both internal and external, are identified and discussed, and the data elements are ranked by their potential usefulness to the retention model. A detailed, step-by-step data analysis and model building process is described, which serves as a guideline for analysts, project managers, or other personnel who may be involved in such an engagement.

FIELD OF THE INVENTION

[0001] The present invention relates generally to retention modeling methodologies, and more particularly, to a retention modeling methodology for airlines.

BACKGROUND ART

[0002] The airline industry is one of the leading industries in today's world. By one estimate, the U.S. airlines' annual revenue in 1997 was $88 billion, of which about 90%, or $79.5 billion, was from passenger fares. In the U.S., domestic air travel accounts for 78% of total air traffic, while international travel accounts for 22%. Of all air traffic in the U.S., 40% of enplanements are for business travel and 60% are for vacation or personal travel.

[0003] Since the late 1970's, along with the deregulation of the U.S. commercial airline industry, competition in the airline industry has intensified. With the increase in competition has come an increased emphasis on retention of valued customers. Across most industries, the basic assumption of customer relationship management is that the cost of customer acquisition is much greater than the cost of customer retention (i.e., it costs less to retain existing customers than to gain new customers). Thus, it is very important for airlines to retain valued customers.

[0004] A solution using internal and external data and professional services to identify those customers “at risk” of changing their air travel carriers could greatly reduce the time and cost to retain high-valued customers. Consequently, by implementation of such a solution, including improved service process and successful marketing campaigns, a company could achieve the goal of retaining its high-valued customers.

[0005] In order to understand the retention question in the airline industry, it is important to understand the airline industry and its business process. The section that follows briefly describes the fundamentals of the airline industry, the U.S. airline industry in particular. The next section then describes the business process.

[0006] Fundamentals of the Passenger Airline Industry

[0007] The Effect of Deregulation in the U.S. and Stagflation in the Early 1980's

[0008] The current passenger airline industry is the result of the evolution of the industry since U.S. airline deregulation. Airline deregulation is officially marked by the U.S. Congress enacting the Airline Deregulation Act (ADA) in October 1978. The years immediately following the passage of the ADA constituted a period of high inflation accompanied by significant economic slowdown (thus the term “stagflation” in the economic literature), caused mainly by an unprecedented increase in oil prices (the so-called oil shock). Jet fuel prices skyrocketed to all-time highs during the period from 1979 to 1982, causing the airlines' operating costs to increase more than 50 percent. The rapid increase in the price of oil not only pushed up costs for the airline industry, but also dragged the U.S. economy into a recession in 1980.

[0009] Since air travel is very sensitive to cost, and to the economic environment in general, the combination of higher air fares and recession led to a substantial decline in traffic volume and profit for airlines. This unfavorable economic environment, plus the uncertainty of the marketplace brought about by deregulation, forced the airline industry into tremendous hardship. Most airlines were ill-prepared for the deregulated competitive marketplace. Many airlines went bankrupt. Some major airlines went out of business permanently; others recovered by either becoming low-cost carriers or re-inventing themselves with new ownership and management.

[0010] Today, there are basically no economic regulations imposed on the U.S. airline industry. Without price ceilings, the airlines determine fares and discounts based on their operational costs and marketing concerns. Without route regulation, the airlines now have more freedom to design their own route networks. The removal of market entry barriers allows new carriers to enter, and local carriers to expand into interstate, long-haul services.

[0011] The competition following deregulation has changed the landscape of the entire airline industry. Some well-known trunk carriers ceased operations, while some lesser-known local or intrastate carriers became major players in the interstate marketplace. The barrier which regulation established between local, intrastate carriers and long-haul, interstate carriers has disappeared, and in this new environment, the airlines have developed a hub-and-spoke routing network.

[0012] Development and Effects of Hub-and-Spoke Routing Network

[0013] The flight services offered by airlines are basically short haul and long haul. For the U.S. airline industry, short haul means a jet flight of less than one hour (i.e., mostly intrastate or local), and long haul means a long-distance flight (i.e., mostly interstate).

[0014] There are roughly 50,000 city-pairs between which passengers travel within the United States. Each pair is customarily called a “market” in the airline industry. However, the nature of economies of scale in aircraft determines that only about 2,000 of these markets have nonstop service. In most markets, passengers have to make an intermediate stop and change planes en route to their ultimate destinations. Such routing is most common for passengers traveling to and from small or midsize cities where there is not sufficient traffic volume to justify nonstop service. The benefits of air travel are speed and convenience, so passengers do not tolerate problems that cause delays or long waiting times. Passengers prefer to use a single airline for their trips, thus reducing difficulties and potential risks.

[0015] For airlines, larger aircraft generally have lower average operating costs per seat-mile. However, for short hauls of up to 1,000 miles, twin-engine aircraft do not have a much higher average operating cost per available seat-mile compared to larger aircraft. Smaller aircraft require fewer pilots and service crews and offer higher fuel efficiency. The development of the hub-and-spoke route network helped the local carriers (for example, USAir) expand into the longer-haul markets. Some of the trunk carriers (such as Delta and United) also quickly adapted their route network design and developed hub-and-spoke operations at major airports throughout the United States. Now, shorter-haul routes operated by smaller twin-engine aircraft serve as “feeders” to the airline's major hubs. At each hub, the airline operates hundreds of flights every day, with densely scheduled arriving and departing flights (called banks). This provides ample possibilities for connections. Today, most major United States airlines operate hub-and-spoke networks.

[0016] As a consequence of the hub-and-spoke routing network, a high proportion of a carrier's flights originate or terminate at an airport where it operates a hub. The airlines provide much less nonstop service between city-pairs. This network gives airlines the benefits of economies of scale and allows them higher operational efficiency. One of the main measurements of an airline's operational efficiency is the load factor, which shows the percentage of seats that are filled. A thin market usually has a low load factor. With the operation of hub-and-spoke networks, airlines are able to substantially increase their load factors for the flights departing from or arriving at the hubs.
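As a rough illustration of the load factor described above, the following sketch computes the percentage of seats filled across a set of flights. The function name and the (seats_filled, seats_available) pairs are hypothetical and chosen only for illustration. Pooling seats across flights, rather than averaging per-flight percentages, is likewise an assumption.

```python
def load_factor(flights):
    """Return the percentage of available seats that were filled.

    `flights` is an iterable of (seats_filled, seats_available) pairs.
    Hypothetical input layout; seats are pooled across all flights.
    """
    flights = list(flights)
    total_filled = sum(filled for filled, _ in flights)
    total_available = sum(available for _, available in flights)
    if total_available <= 0:
        raise ValueError("at least one available seat is required")
    return 100.0 * total_filled / total_available


# Example: a thin-market flight and a fuller hub flight.
print(load_factor([(60, 100), (90, 100)]))
```

Pooling weights each flight by its seat count, so a large, well-filled hub flight moves the figure more than a small feeder flight.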

[0017] Most major cities have at least one carrier operating a hub at their airport. Some larger cities have more than one carrier operating hubs at their airports. One major consideration when airlines choose their hub location is the potential local traffic volume, that is, the number of travelers available in the surrounding metropolitan area. The hub-and-spoke network allows residents of these major cities to travel to most destinations with direct flights. On the other hand, travelers to or from small or midsize cities generally have flights to hubs where they can receive convenient connecting services to their ultimate destinations. The hub-and-spoke networks provide passengers the benefits of convenience, easy connection, low layover time, and direct transport of baggage, all at a reasonable price. These help make air travel more convenient and popular.

[0018] Pricing Policies of the U.S. Carriers

[0019] Though pricing policies differ among the airlines, the basic principles are the same. Fares in a thin market are generally higher than in a dense market, and the fares in the short-haul markets tend to be comparatively higher. Though competition in the marketplace drives the pricing structure, the economic reason for pricing disparity is the value of time. Air travel saves time, and passengers will choose air travel when the value of the time saved by air travel is higher than the extra expense incurred. The time sensitivity of passengers is critical in determining the load factor and pricing.
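The value-of-time rule stated above can be written as a one-line comparison. The sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes the passenger's value of time can be expressed as a single monetary amount per hour.

```python
def prefers_air_travel(hours_saved, value_per_hour, extra_expense):
    """Return True when the value of the time saved exceeds the extra expense.

    All three parameters are hypothetical illustration inputs:
    hours_saved    -- time saved by flying instead of the alternative
    value_per_hour -- the passenger's monetary value of one hour
    extra_expense  -- additional cost of flying over the alternative
    """
    return hours_saved * value_per_hour > extra_expense


# A time-sensitive traveler versus a less time-sensitive one, same trip.
print(prefers_air_travel(4, 100.0, 250.0))
print(prefers_air_travel(4, 20.0, 250.0))
```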

[0020] Another factor that explains the airlines' pricing policy is economy of scale. The major assets of the airline are the airplanes. However, the seats on the airplane are “perishable” assets in the sense that when the airplane takes off, the unfilled seats are useless to the airline. On the other hand, the cost of serving one additional passenger on a flight is very low. Therefore, airlines have a strong incentive to increase the number of passengers on their flights. One way to achieve this goal is to reduce prices; thus discount fares are important to drive the load factor. But offering across-the-board discount fares would lead to a reduction in revenue, and airlines realize that it is important for them to target only a subset of passengers for discounts. The following are common practices in the airline industry:

[0021] Restrictions Associated with the Discount Fare: These restrictions include advance purchase, minimum stay, non-refundability, etc. In addition, some discount fares have designated fly dates. This policy distinguishes the business travelers from the leisure travelers because the business travelers usually cannot meet these restrictions.

[0022] Capacity Control: Airlines control the number of seats available for discount fares on each flight. This policy helps airlines reduce the probability that a passenger who is willing to pay the full coach fare will not be able to get a seat on the preferred flight.

[0023] Segmented Days: The airlines segment a day into several time bands. The availability of seats for discounted fares differs across time bands. For example, at peak times such as late afternoon and evening, there are fewer seats for discount fares than in non-peak time bands.

[0024] The result of the above practices is that passengers on the same flight may pay totally different fares even for the same service class. However, the passengers who pay higher fares are more time sensitive and prefer the flexibility to fly the flights that they select. These differences can be used to classify customers in the analytical modeling process.

[0025] International Aspects of Airline Industry

[0026] The international market is the fastest growing market for major airlines. For the major U.S. carriers, international travel accounts for 27% of total traffic and 22% of revenue. Traffic between the U.S., Europe and Latin America is growing at an estimated 10% per year. For the first four months of 1999, the growth rate was 7.6%. In Asia-Pacific, even though the economic conditions are not favorable, air travel volume is still growing, and for the first quarter of 1999, the traffic growth rate for Asia-Pacific airlines was 3.9%.

[0027] International aspects of the airline industry are somewhat different from those of the U.S. industry. By international, we mean the airlines of non-U.S. countries (foreign airlines) and the airlines that operate in the international market. The following are major characteristics of international airlines:

[0028] Unlike the U.S. airlines, most foreign airlines are still regulated or controlled by their respective governments.

[0029] Though most foreign airlines operate from a few major hubs in their countries, they do not operate over a vast hub-and-spoke network such as in the U.S.

[0030] Airlines' international operations are dictated by bilateral agreements. These agreements determine the city (hub), the country, and the schedule.

[0031] International airlines tend to be long-haul service providers and operate over a city-pair route, not with a bank of flights.

[0032] International airlines' pricing is regulated by an international organizational body of airlines, though the role of this cartel is diminishing.

[0033] These characteristics of international airlines provide an advantage to retention modeling, since the customers who fly over international routes are easier to identify. The airlines operating in any particular international market are few; therefore the customers have less choice, and benchmarking and marketing are easier.

[0034] A Business Process Model for Air Travel

[0035] Airlines operate flights on a predetermined schedule. The origin and destination (O&D), the departure and arrival times, intermediate stopping points, and equipment used are all prescribed. It is very rare that a passenger carrier will fly outside its schedule. That means that air travel service is not offered “on demand.”

[0036] In general, customers select airlines based on the following considerations:

[0037] Their travel needs and how flexible their travel might be;

[0038] The availability of the flights on the specific time and route;

[0039] The pricing (the fares they are willing to pay);

[0040] The convenience (such as departing time, change of flights en route, the duration of the flight, arrival time, the distance of the airport from their residence, etc.);

[0041] Quality of service or customer satisfaction (from their past experience with the carrier);

[0042] The benefits of the frequent flier programs if they have enrolled in any;

[0043] The competitors' offerings.

[0044] When the customer's preferred price range, timing, and O&D match an available flight offered by the airline, the customer can make a reservation (booking) and then purchase a ticket. The booking process can be conducted through travel agencies, by calling the airlines directly, or via the internet. Despite the increasing usage of the internet, travel agencies are still the number one source for air travel reservations. Most business travel is booked through travel agencies, and many corporations retain their own travel agencies to handle their employees' business travel needs. Through booking, the customers, with the help of travel agencies, will find a matching flight offered by an airline to their destination, at their preferred traveling time, at an acceptable price. In a competitive market, the customers usually have several choices.

[0045] The passengers can cancel their reservations before the travel happens. However, certain penalties will accrue with the cancellation, based on the type of ticket they booked. The airlines offer different levels of service: coach (economy), business, and first class. All these services generate revenue for the airlines. Of course, the higher the class of service, the more revenue the airline earns.

[0046] Most major airlines offer frequent flier programs to their customers. When a customer is enrolled in a frequent flier program, each time the customer flies, the mileage for the length of the flight is entered into the airline's computer system. When the customer's accumulated total mileage reaches a pre-determined level, he/she earns the right to a bonus flight to a selected destination or a free upgrade. From the airline's point of view, this kind of air travel is called a “reward” flight. The mileage earned on a reward flight is recorded as “bonus mileage”, but the flight generates no revenue for the airline. Airlines impose restrictions on when and how a frequent flyer can redeem mileage and obtain benefits.

[0047] Defining Retention

[0048] Two Types of Attrition

[0049] Retention means keeping, or retaining, existing customers. The retention models described below assume that the airlines want to retain high-valued customers. The determination of which customers are high valued is discussed infra.

[0050] The need for retention activities by the airline comes from the fact that in a competitive market customers have the ability to choose their suppliers. The opposite of retention is attrition, which one author defines as follows:

[0051] “As applied to customers, it is that state in which a customer, for personal reasons, begins to question continued patronage of a supplier.”

[0052] This section first defines two types of attrition:

[0053] contractual attrition, in which a customer, who has a contract with a supplier, cancels the contract and transfers his business to another supplier; and

[0054] situational attrition, where there is no contract for services but the customer switches suppliers because the situation makes the new supplier seem more desirable.

[0055] When considering the definition of attrition for the airline industry, it must be understood that there are fundamental differences between the passenger airline industry and other industries, such as the telecommunication industry. One major characteristic of telephone services is that customers usually have an existing service contract with the carrier. This contract stipulates that the customer subscribes to the telephone services provided by the telecommunication service carrier. Through this subscription, the customer actually purchases an option to make and receive calls. This option provides the customer access to (not usage of) the telephone network. In the United States, another major characteristic is that telephone companies are supposed to provide universal service to all households. Therefore, not only is a customer assumed to use telephone services regardless of which carrier provides the service, but a customer also expects that the service will be available whenever the customer needs it. Customer attrition in telecommunications is termination of the existing contract with the carrier. When that happens, the service provider knows that this customer is going to defect and assumes that this customer will switch to a competitor for telecommunications services. This type of attrition is called contractual attrition.

[0056] The passenger airline industry is different from the telecommunications industry. First, there is no contractual relationship between a customer and an airline for air travel services. Customers do not need to purchase an option to access the airline services. Customers can choose when to fly, where to fly, and which airlines to fly, anytime, anywhere, all at their own free will and preference, without a binding contract. For example, a customer can walk into an airport, approach an airline ticket counter, and ask for a “stand-by” ticket. That means, whenever a flight to his/her destination has a vacant seat, he/she can buy the ticket and board the airplane immediately. On the other hand, an unexpected change in a customer's schedule may lead the customer to change his/her flight within the same airline, or even to switch to another airline.

[0057] Furthermore, unlike telecommunications or other public utility services, where the services are “on demand”, that is, available around the clock, customers' choices of airlines are constrained by the availability of flights to and from their destinations. In order for a customer to choose a certain flight, the airline's offering must match the customer's preference. For example, a customer who resides in city A usually prefers to fly on airline S because he is a member of airline S's frequent flyer program. When he needs to fly from city A to city B, if airline S does not operate non-stop service in that market, this customer may then choose another airline operating non-stop service in that market. Of course, the hub-and-spoke network allows the customer to fly from city A to city C (another hub of airline S), then change flights to city B. But that may take more time or require flights not in the customer's preferred time band. Under those circumstances, this customer might choose another airline for this trip. Does that mean the customer was about to defect? Not necessarily. He might come back to airline S for trips whenever the flights were “right”. Or he might defect if he found the other airline offers better services or better choices.

[0058] Another difference is that there is no assumption of universal service for air travel. In spite of increasing air travel volume, flying is still not considered the first choice of travel means for many people. In fact, there are other forms of travel, e.g., automobiles, buses, and trains, and so the elasticity of substitution for flying is usually high. There is also a substitution effect between telecommunication and air travel. Along with the rapid expansion of telecommunication, the need for flying decreases. When a customer stops flying an airline, he may or may not “switch” to another airline. He may not need to fly because his job or business has changed; he may choose to drive because the traveling distance has been reduced or because driving is more convenient; or he may choose to make a conference call instead of traveling to a meeting place. A reduction in flying mileage is not by itself determinative of whether the customer is defecting. This type of defection can be called situational attrition. In situational attrition, because there is no contractual relationship between the customer and the supplier, customers choose their supplier based on their current need, the availability of the services, and other considerations specifically related to the situation.

[0059] Modeling retention for situational attrition is a much more challenging task for an analyst. The foremost question the analyst needs to answer is how to define defection; in other words, how to define the subgroup of existing customers who still need the services but are highly likely to change their service provider. Several approaches have been proposed for defining defection in the passenger airline industry.

[0060] Operational Definitions and Descriptions

[0061] Operational definitions use specific information from customer databases to determine categories for customers. The categories may include loyal customers and defectors. While such operational definitions may work, there are problems with them in an airline environment. Some of the possible definitions and problems are examined and discussed below.

[0062] One approach is to define retention based on the operational information available from airlines' operational and revenue databases. This approach distinguishes loyal customers from customers who used to be loyal but have demonstrated defection behavior. There are several possible definitions derived from this approach. Parameters P, Q, X, Y, and Z are used in these definitions and their values can be determined empirically through analysis of customer data, as follows:

[0063] P: the time period during which a steady flying pattern can be established to identify loyal customers (This should be a minimum of one to two years.);

[0064] Q: the time period during which different flying patterns can be observed to distinguish the defectors from the loyal customers (This should be a minimum of two years in order to account for seasonal patterns.);

[0065] X, Y: average monthly flying miles (or frequency, or revenue); and

[0066] Z: a predetermined percentage or measurement value.

[0067] The operational definition approach described above is summarized in FIG. 1. The following elements contribute to the operational definitions of loyal customers (retention) and defectors (attrition):

[0068] (1) Substantial Decrease in Miles Flown:

[0069] A loyal customer is one whose average monthly mileage traveled over the past P months was greater than X miles and for the consecutive Q months, this loyal customer's average monthly traveling mileage was at or above the X level.

[0070] A defector is a customer whose average monthly mileage traveled over the past P months was at or above X miles; however, for the consecutive Q months, this customer's average monthly traveling mileage had dropped below Y miles.

[0071] Furthermore, the magnitude of the drop in average monthly traveling miles from X to Y is considered “substantial” if it exceeds Z%.

[0072] (2) Gradual Decrease in Miles Flown:

[0073] A loyal customer is one whose average monthly mileage traveled over the past P months was greater than X miles and for the consecutive Q months, this loyal customer's average monthly traveling mileage was at or above the X level.

[0074] A defector is a customer whose average monthly mileage traveled over the past P months was at or above X miles; however, for the consecutive Q months, this customer's average monthly traveling mileage had dropped below Y miles.

[0075] Furthermore, the magnitude of the drop in average monthly traveling miles from X to Y is considered “gradual” if it is less than Z%.

[0076] (3) Significant Decrease in Flown Revenue:

[0077] A loyal customer is one whose average monthly revenue generated from air travel over the past P months was greater than $X and for the consecutive Q months, this loyal customer's average monthly revenue was at or above the $X level.

[0078] A defector is a customer whose average monthly revenue generated from air travel over the past P months was at or above $X; however, for the consecutive Q months, this customer's average monthly revenue had dropped below $Y.

[0079] Furthermore, the magnitude of the drop in average monthly revenue from $X to $Y is considered significant if it is greater than or equal to Z%.

[0080] (4) Decrease in Frequency of Trips:

[0081] A loyal customer is one whose average monthly number of segments flown over the past P months was greater than X and for the consecutive Q months, this loyal customer's average monthly number of segments flown was at or above the X level.

[0082] A defector is a customer whose average monthly number of segments flown over the past P months was at or above X; however, for the consecutive Q months, this customer's average monthly number of segments flown had dropped below Y.

[0083] (5) Change in the Share of the Customers' Total Air Travel Expenses:

[0084] A loyal customer is one whose average monthly ratio of a measurement over the past P months was greater than X and for the consecutive Q months, this loyal customer's average monthly ratio was at or above the X level.

[0085] A defector is a customer whose average monthly ratio of a measurement over the past P months was at or above X; however, for the consecutive Q months, this customer's average monthly ratio had dropped below Y.

[0086] This ratio of share and the measurement of the customers' total air travel expense are undefined, and will depend on the availability of the external data.

[0087] (6) Change in the Customer's Elite Club Status:

[0088] Most frequent flyer programs establish elite passenger clubs, usually having several levels of membership, such as gold, silver, and bronze. A customer may become a club member with a certain standing by accumulating the respective mileage-points. These club members are evaluated by the airline periodically, and anyone whose mileage-points have decreased is re-classified to a lower grade of membership. This re-classification is used to identify an at-risk customer when a continuing downgrading is found.

[0089] (7) Change in Travel Pattern:

[0090] During P months, a customer's pattern of flying can be determined by several measurements, such as revenue generated, routes flown, fare type, destinations, staying time, etc. Then, the same factors can be examined during the window period of Q months, or the same time frame of the previous year. The comparison of these factors may reveal a change in the customer's flying pattern. Combined with one of the above definition measurements, a possible defector may be identified. This is a broad definition that offers flexibility and the ability to accommodate the available data; however, it may require substantially more customer data (such as external or socioeconomic data) and a better understanding of the customer.
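The threshold-based definitions (1) through (4) share one shape, and definition (6) reduces to a scan over periodic tier evaluations. The following is a minimal sketch under stated assumptions, not the procedure of FIG. 1. All function names, labels, and tier names are hypothetical. It assumes the monthly series is ordered oldest first, that the P-month baseline window immediately precedes the Q-month window, that the baseline test is "at or above X", that the percentage drop is measured between the customer's two observed averages, that a drop of exactly Z% counts as gradual, and that "continuing downgrading" means at least two consecutive downgrades.

```python
from statistics import mean

# Hypothetical tier ordering for an elite club (higher is better).
TIER_RANK = {"none": 0, "bronze": 1, "silver": 2, "gold": 3}


def classify_decline(monthly_values, p, q, x, y, z):
    """Classify a customer from monthly miles, revenue, or segments.

    monthly_values: per-month measurements, oldest first (assumed layout).
    p, q: lengths in months of the baseline and observation windows.
    x, y: baseline and defection thresholds; z: drop threshold in percent.
    """
    if p <= 0 or q <= 0 or len(monthly_values) < p + q:
        raise ValueError("need at least p + q months of data")
    baseline = mean(monthly_values[-(p + q):-q])  # P-month window
    recent = mean(monthly_values[-q:])            # following Q-month window
    if baseline <= 0 or baseline < x:
        return "not_established"   # never met the loyal baseline
    if recent >= x:
        return "loyal"
    if recent < y:
        drop_pct = 100.0 * (baseline - recent) / baseline
        return "substantial_decrease" if drop_pct > z else "gradual_decrease"
    return "undetermined"          # between Y and X: not covered by the text


def has_continuing_downgrade(tier_history, min_consecutive=2):
    """Return True if the tier fell at min_consecutive evaluations in a row."""
    streak = 0
    for previous, current in zip(tier_history, tier_history[1:]):
        if TIER_RANK[current] < TIER_RANK[previous]:
            streak += 1
            if streak >= min_consecutive:
                return True
        else:
            streak = 0
    return False


# Twelve steady months followed by twelve much lighter months.
print(classify_decline([5000] * 12 + [1000] * 12, 12, 12, 4000, 3000, 30))
print(has_continuing_downgrade(["gold", "silver", "bronze"]))
```

The same classifier can be fed miles, revenue, or segment counts, since the definitions differ only in the measurement. The "undetermined" label marks the gap between Y and X that the definitions leave open.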

[0091] The advantages of this operational definition approach are:

[0092] The definitions are derived directly from the airline's own operational data (except, probably, definition 7);

[0093] The definitions are relatively easy to adapt to the availability of, and changes in, the data;

[0094] The approach takes into consideration the customers' historic pattern of air travel; and

[0095] The approach is thought to provide a direct measure of a customer's intention to defect from their current carrier.

[0096] However, this approach does not consider the unique characteristics of the passenger airline industry, e.g., situational attrition as discussed earlier. A passenger's changing travel pattern for a prolonged period (time frame Q) can be caused by one or more of the following reasons:

[0097] Change of job or business need;

[0098] No available flights to or from the selected destinations offered by the airline;

[0099] Flights available from the airline do not satisfy the customer's preference;

[0100] Competitor's offers are better;

[0101] Other personal reasons; and

[0102] Customer intends to defect.

[0103] Therefore, simply observing a drop in average monthly flying miles or revenue contributions may not warrant the conclusion that the customer is going to defect. In fact, one study has shown that “Job has not required flying recently” and “Changes in Job/Responsibilities” are the two most important reasons for decreased or even stopped flying. Job-related changes account for over 60% of lost business. When the situation becomes “right”, the customer may very well continue to fly the same airline. From the customers' point of view, since there is no contractual relationship with the airline, there is no need for the customer to take deliberate actions, such as terminating a service contract or not renewing the contract, to defect.

[0104] Unless the benefits associated with the continuing relationship with the same airline are overwhelming, the customer probably always selects the most convenient, fastest and cheapest option.

[0105] Another problem with the operational definition approach is that it does not consider competitiveness in the marketplace. A defection is defined within a competitive market framework. When more than one supplier in the same market provides similar products and services and a customer who has been loyal toward one provider for a certain period of time chooses another provider for the same services, the provider who lost the customer will see that loss as defection or attrition. Obviously, the key is that the customers have choices and the competitive market provides the choices to the customers. In a monopoly market, the customers have no choice of service provider and therefore there is no attrition. The same is true in the airline industry. If only one airline operates in certain markets, then the customers have no choice but to fly that airline. Even if more than one airline operates in certain markets, if for a certain date or time band there is only one operating, then the customers still have no choice. As discussed before, most city-pairs in the United States have no non-stop air service. For the customers to or from small or midsize cities, only a few airlines operate the short-haul flights in those markets, and most of those flights are to feed the hubs. For example, for a customer flying out of Ithaca, N.Y., the only choice is currently USAir Express. USAir operates in those markets because historically it was a local carrier with an operation charter in those markets. The customer flying USAir Express may continue to fly USAir from one of its hubs to another hub. Is this customer a loyal customer to USAir? Maybe or maybe not. First, this customer has no choice, and second, since this customer has to fly USAir, he/she may join USAir's frequent flyer program to gain benefits and thus continue to fly USAir. We do not know what the customer will do if other airlines operate in the same market and offer competitive schedules and benefits.

[0106] In summary, retention modeling based on operational definitions of defection for the airline industry targets a population so heterogeneous that no unique behavior pattern can be identified and predicted. In addition, without a competitive market environment, no meaningful defection actions can be observed. Thus, there is a need in the art for an improved method of modeling customer retention for airlines.

DISCLOSURE/SUMMARY OF THE INVENTION

[0107] It is therefore an object of the present invention to provide an improved airline customer retention modeling methodology.

[0108] Another object of the present invention is to enable airlines to improve customer relationship management.

[0109] The components enable the Passenger Carrier Airlines to effectively address “top-of-mind” Customer Relationship Management issues, such as how to retain high-valued customers.

[0110] The solution components were developed based on extensive communications industry, data warehousing, and data mining experience.

[0111] The above described objects are fulfilled by a method of building a customer retention model. Data elements and data sources are identified. A data file format is laid out and statistical and analytical packages are identified. The statistical and analytical packages are applied to data from the data sources fulfilling the data elements identified in the data file format to perform customer retention. In an alternate embodiment, the method includes applying the statistical and analytical packages to data from the data sources fulfilling data elements identified in the data file format to identify customers for customer retention.

[0112] Still other objects and advantages of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein the preferred embodiments of the invention are shown and described, simply by way of illustration of the best mode contemplated of carrying out the invention. As will be realized, the invention is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the invention. Accordingly, the drawings and description thereof are to be regarded as illustrative in nature, and not as restrictive.

[0113] Ideally, the Analytic Modeler uses the Teradata Warehouse, built from the Logical Data Model for an Airline, as the model's data source. The data preparation process is likely to be simplified when the data is taken from the warehouse; however, a data warehouse implementation is not required.

BRIEF DESCRIPTION OF THE DRAWINGS

[0114] The present invention is illustrated by way of example, and not by limitation, in the figures of the accompanying drawings, wherein elements having the same reference numeral designations represent like elements throughout and wherein:

[0115] FIG. 1 is a chart of an operational definition of customer loyalty;

[0116] FIG. 2 is a high level chart of the predictive power of the retention model of the present invention; and

[0117] FIG. 3 is a high level diagram of an analytical modeling data structure used in an embodiment of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

[0118] A method and apparatus for modeling customer retention for airlines are described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

[0119] The present invention described herein is related to, and forms a part of, an acquisition and retention modeling methodology as described in copending applications, “An Acquisition Modeling Method for Airlines” (Docket No. 8896 (3225-114)) and “Logical Data Model for Airline Customer Relationship Management” (Docket No. 8904 (3225-118)), both assigned to the present assignee and incorporated herein in their entirety by reference.

[0120] Customer Profile and Competitive Market Approach

[0121] Based on the characteristics of the airline industry and the competitiveness of the air travel market described above, an inventive approach to defining defection/attrition, and thereby to defining retention, is described herein. This approach is an improvement over the previous approach. The competitiveness and the situational attrition of the passenger airline industry are taken into consideration. This approach leads the retention models to target a much more homogeneous population within a competitive market environment and enhances the predictive power and accuracy of the retention models.

[0122] The competitiveness of the market means customers may select the airlines for their travel needs and implies that there may be competitive flights available to the customer. A customer who, given those choices, consistently selects one airline instead of another is a loyal customer. If a particular customer's usage of an airline has dropped for a prolonged period, then this may be a customer “at risk” of defection.

[0123] To determine the competitiveness of the market, first we consider the market share of each major airline. The market share information is readily available. We want to consider customers who fly in markets neither dominated by a particular airline (for example, the client airline), nor negligible to that airline. Neither of those markets is considered competitive for our purposes. Another factor is the number of players in a market. If there are a few players and each has a reasonable market share, then that market is highly competitive.

[0124] This consideration leads us to believe that the retention modeling efforts should concentrate on the few cities where a client airline has established hubs. Those hubs carry most of the traffic volume of the airline, both from a local market as well as from the spokes feeding the hubs. These hub-markets are:

[0125] Not dominated by only one airline;

[0126] Two or more airlines operating from the hubs;

[0127] The airlines offer competitive flights; and

[0128] The hubs pick up a large amount of traffic volume, both local customers and transfers from spokes.
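By way of illustration only, the market screen described above may be sketched as follows. The function name, the threshold values (a 70% dominance cut-off, a 5% negligibility cut-off, and a minimum of two players), and the share figures are assumptions made for this sketch; the description does not fix any numeric thresholds.

```python
def is_competitive_market(shares, client, dominant=0.70, negligible=0.05, min_players=2):
    """Screen one market using carrier market shares (fractions of 1).

    shares maps a carrier code to its share of the market. All thresholds
    are illustrative assumptions, not values taken from the description.
    """
    if not shares:
        return False
    # Carriers with a non-negligible share count as players in the market.
    players = [carrier for carrier, share in shares.items() if share >= negligible]
    # The market must not be dominated by any single airline.
    not_dominated = max(shares.values()) < dominant
    # The market must not be negligible to the client airline.
    client_present = shares.get(client, 0.0) >= negligible
    return not_dominated and client_present and len(players) >= min_players


print(is_competitive_market({"AA": 0.45, "UA": 0.40, "DL": 0.15}, "AA"))
```

A market in which one carrier holds nearly all of the traffic, or in which the client airline barely operates, is screened out under these assumed thresholds.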

[0129] In determining the target population of retention models, first, choose the members of the frequent flyer programs. The frequent flyer program provides not only most of the high valued customers but also more complete data. Then, from the frequent flyer customers, the high valued customers are selected based on a Customer Value Model. Studies of the U.S. airline industry show that less than 20% of the high valued customers contribute over 50% of the revenue and a significant portion of profit to airlines. Therefore, retaining a high valued customer makes significant contributions to an airline's profit margin. Thus, high valued customers flying out of a predetermined competitive hub are selected.

[0130] Then, customers' profiles are established, particularly their travel patterns. The travel patterns are identified by several factors, such as O&D, travel time (departing and returning time), staying time, number of legs of trips, booking channel, service class, etc. Unlike the seventh definition mentioned above, it is not believed that changing the travel pattern itself will help define the defection. A changing travel pattern is more of an indication of a customer's changing travel need. How this changing travel need affects the customer's choice of airline depends on the surrounding situations and market conditions. For example, a customer may continue to fly the same airline even though his/her destinations have changed. A customer may switch to another airline even though his/her destinations have not changed but only the departing time has changed. Under those circumstances, a closer look at the flight data may reveal that in the customer's new time band, the client airline does not offer the flights that he/she prefers; therefore, this customer has no choice but to switch to another airline offering the preferred flight. By choosing the high valued customers in the competitive hubs, it is assumed that the client airline has the capacity to serve the customers and offers flights to meet the customers' needs. Therefore, a customer drastically reducing his/her usage of the airline is highly likely to switch to a competitor, given that there is no major change in the customer's socioeconomic condition.

[0131] According to this approach, the criteria for a loyal customer are defined as follows:

[0132] The customer has shown a steady trend of flying the client airline for a predetermined length of time;

[0133] The customer chooses the client airline in a competitive market environment;

[0134] The customer chooses flights operated by the client airline when there are competitive flights available; and

[0135] Since the airlines pay much attention to the members of their frequent flyer programs, we assume that the loyal customer should be a member of those programs.

[0136] Consequently, customer attrition is defined as:

[0137] The customer used to be a loyal customer;

[0138] The customer still flies in a competitive market;

[0139] The flights operated by the client airline are still available to the customer;

[0140] The customer may still keep his/her frequent flyer program membership; and

[0141] The customer drastically reduced his/her usage of the client airline.

[0142] Of course, the usage can be measured by the operational measurements discussed above, e.g., variables P, Q, X, Y, and Z.

[0143] This approach can be summarized in FIG. 2, showing that as the homogeneity of customers increases by concentration on a sub-group of the total population, the predictive power of the retention model increases.

[0144] Defining the Dependent Variable

[0145] Once the business question of what to model is clearly defined, the next step is to define the analytical model's dependent variable. The retention model described in this document applies to customer-level data. The dependent variable for the model reflects the customer's decision to continue flying the same airline or switch to another airline. This dependent variable needs to be derived from data when:

[0146] A high valued customer base is obtained;

[0147] Loyalty measurement of customers has been established; and

[0148] Customers' attrition/non-attrition behavior can be identified, based on the defection definition discussed supra.

[0149] Historical information on customers' air travel patterns is provided. A customer who in period P flew in markets where there is sufficient competition in similar flights on the same routes, and who has stopped or significantly reduced flying the same airline during period Q, is defined as a defector. In addition, the customer has not been flying in other market segments. The latter information shows that the customer's travel need has not changed.

[0150] The possible causes for defection are independent variables and are derived from the data. The dependent variable field is coded 1 for the attrition customer's record; otherwise, the dependent variable field of the customer record is coded 0. This binary variable is the dependent variable of the retention models.

[0151] For example, a customer who is a member of a frequent flier program usually flew round-trip from Newark (EWR) to Baltimore-Washington International Airport (BWI) or Atlanta (ATL) during the months from January to December of 1998. The markets he flies are highly competitive, which means there are several airlines available for selection. Examination of the data further reveals that he usually takes flights departing from EWR in the morning, stays for a couple of days, and then flies back on early afternoon flights. He purchased tickets through a travel agency, usually with only one-week advance booking, and thus paid full fare. The Customer Value model, more fully described below, indicates that this customer is a high valued customer. However, recent data shows that for the months of January to June of 1999 (the window period), there are no records of the customer flying on the same airline (the client airline). External data, if available, shows that there is no change of job or address. All these factors indicate that the customer is highly likely to defect. Thus, the attrition field of the customer record is coded or set to a value of 1.
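The coding of the dependent variable may be sketched as follows, purely as an illustration. The record layout, the field names, and the two thresholds (a minimum monthly trip rate in period P and a maximum ratio of the period Q rate to the period P rate) are assumptions introduced for this sketch; the description leaves the meaning of "significantly reduced" to the analyst.

```python
from dataclasses import dataclass


@dataclass
class CustomerHistory:
    trips_in_p: int             # client-airline trips during base period P
    months_in_p: int
    trips_in_q: int             # client-airline trips during window period Q
    months_in_q: int
    competitive_market: bool    # flew in sufficiently competitive markets during P
    trips_other_markets_q: int  # trips in other market segments during Q


def code_dependent_variable(h, min_rate_p=1.0, max_drop_ratio=0.25):
    """Return 1 for an attrition record and 0 otherwise (thresholds are assumed)."""
    rate_p = h.trips_in_p / h.months_in_p
    rate_q = h.trips_in_q / h.months_in_q
    # The customer flew the client airline steadily in a competitive market.
    was_loyal = h.competitive_market and rate_p >= min_rate_p
    # Usage stopped or dropped sharply during the window period.
    reduced = rate_q <= max_drop_ratio * rate_p
    # No flying in other market segments suggests the travel need is unchanged.
    need_unchanged = h.trips_other_markets_q == 0
    return 1 if (was_loyal and reduced and need_unchanged) else 0


# A customer like the one in the example: regular trips in 1998, none in the 1999 window.
example = CustomerHistory(24, 12, 0, 6, True, 0)
print(code_dependent_variable(example))
```

Trip counts are converted to monthly rates because periods P and Q need not have the same length.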

[0152] Dependencies

[0153] Typically about 60 to 80 percent of a retention analytical modeling project is spent on data preparation. For the airline industry, which has a tremendous amount of operational data, building analytical models without a data warehouse is a very difficult, if not impossible, task. At any rate, decisions about data sources, locations and availability should be resolved at the beginning of the analytical modeling project. This is accomplished through in-depth discussions between modelers and client airline personnel possessing the appropriate knowledge. It is assumed that the client is prepared and provides the necessary (internal, transactional) data, at some agreed upon levels of summation, in a mutually acceptable form. It is recommended that analytical modelers/analysts and project managers do the following:

[0154] Provide the list of data elements (i.e., customer and operations data elements, model output data, and other desirable data) that may be needed for building retention models. If a data warehouse (DW) is installed, the data elements will be drawn from the DW;

[0155] Engage in discussions with the client's personnel to identify possible data sources;

[0156] Discuss the possibility of including external data; and

[0157] Lay out the data file format.

[0158] The following are prerequisites for a modeling engagement:

[0159] Determine the location and availability of data sources;

[0160] Decide which data sources (internal and external) will be used;

[0161] Agree on a defined data format;

[0162] If a DW is installed, the above are part of the DW efforts;

[0163] Client airline personnel are prerequisites;

[0164] Responsibility for data source availability is defined;

[0165] Responsibility for providing insight on the data is defined; and

[0166] Acquire statistical and analytical packages.

[0167] In summary, if a DW is installed, the analytical modelers will rely on the DW as the data source; otherwise, the analytical modelers obtain data from the original sources. The more detailed data preparation process is discussed below.

[0168] Customer Value Metric Model

[0169] Customer valuation is a very important issue for all airlines. When airlines want to pursue retention, acquisition, or business growth, the foremost task is discovering who the most valuable customers are. A sound methodology is required to help airlines solve this problem.

[0170] The following describes the definitions of customer value and the methodology used to develop a Customer Value Metric Model (CVMM). This model ranks passenger data and identifies the most highly valued customers for the carrier. The customer valuation model is not a Recency-Frequency-Monetary model, commonly known as RFM. The CVMM provides a much more sophisticated and balanced methodology to score customers and can be carried out with or without retention modeling.

[0171] Defining Customer Value

[0172] Airlines are genuinely interested in finding high valued customers, as described in detail above. The question is which criteria carriers may use to define customer value. Criteria in the present invention include recency (time period), frequency (mileage), and revenue (profit). It is not, however, the standard RFM model with which many are familiar. The model described below presents a more sophisticated approach to the problem of defining value.

[0173] As discussed previously, while the airlines' profit margin is generally low, the marginal cost of adding one more passenger to an aircraft is also very low due to the economies of scale of the aircraft. Each airline has its so-called break-even load factor, that is, the percentage of the seats that the airline must sell at a given price (yield) to cover its costs, including operational costs, airport fees, commissions paid to travel agencies, and other costs. Given a revenue level, lower costs result in a lower break-even load factor. Even though revenue and costs vary from one carrier to another, on average the break-even load factor is about 65% for the airline industry.

[0174] Most airlines operate very close to the break-even load factor. Therefore, the marginal revenue earned from the sale of one additional seat on each flight contributes significantly to the airline's profitability. Frequent flyer programs are used commonly in the airline industry to attract passengers. An industry wide study has shown that frequent fliers not only contribute significantly to airlines' revenue and profit, but also make up a large portion of the passenger traffic volume. Passengers taking more than ten trips a year, though accounting for only 8% of the passenger population for a given year, contribute about 45% of air travel volume. This fact tells the airlines that those customers are the most prized ones. Obviously, the customer valuation model needs the ability to identify these customers. Thus, one criterion used in the CVMM for a high valued customer is flying frequency.

[0175] Another criterion is the revenue contributed by the customer. As discussed before, a passenger paying full fare is more valuable than a passenger paying a deeply discounted price, even though they may sit next to each other in the same section of a flight. The airline pricing policy distinguishes between these two types of customers. The revenue measure is the ticket price minus the airport fee, commission, and certain taxes, but not the operational cost. Operational cost on average, in terms of per seat/per mile, is more or less constant across the airline and is not considered, thus simplifying the task.

[0176] Another revenue-related measurement is flying mileage. From the carrier's revenue management point of view, a customer who flies more miles generates more revenue.

[0177] These three measurements together create many possible combinations. Among them, revenue contribution is the most important, while the other two (i.e., frequency and mileage) are complementary factors. For simplicity, revenue contribution in combination with either one of the other two measures is used as a classifier.

[0178] These criteria give a three-tiered structure of customer value as shown in Table 1 below:

TABLE 1

Tier 1:  High Frequency (Mileage)/High Revenue Contribution
Tier 2:  High Frequency/Low Revenue Contribution
         Low Frequency/High Revenue Contribution
         High Mileage/Low Revenue Contribution
         Low Mileage/High Revenue Contribution
Tier 3:  Low Frequency (Mileage)/Low Revenue Contribution

[0179] Of course, the third tier customers, Low Frequency (Mileage)/Low Revenue Contribution, will not be the ideal target of the predictive model, while the first tier customers are the most valuable customers for airlines. The problem is the customers in the second tier. Are they also valuable customers? How does an airline deal with these groups?

[0180] Airlines may want to retain the Low Frequency (Mileage)/High Revenue Contribution customers for obvious reasons. However, those customers may not be so loyal to the airline because the benefits they receive from frequent flyer programs are not significant enough to keep them flying the same airline.

[0181] On the other hand, the High Frequency (Mileage)/Low Revenue Contribution customers may be loyal to the specific airline because of the benefits from the frequent flyer programs, but their marginal contributions to the profitability of the airline are low. Airlines may want to keep these customers only because they hope that these customers may eventually generate additional revenue. With the growing number of affinity programs that airlines have established with credit card companies, hotel and car rental companies, and even long distance telephone companies, some customers may be able to accumulate high mileage points without contributing any revenue to the airline. These customers need to be identified.

[0182] Developing a Customer Value Metric Model (CVMM)

[0183] Data Requirements

[0184] As discussed above, a customer's value is measured by the customer's contribution to the carrier's profit. The following data elements are essential:

[0185] Passenger frequent flyer program membership information;

[0186] Most recent passenger flying data (including departing/arrival airports, flight numbers, distances flown, etc.);

[0187] Booking channel data;

[0188] Ticketing data, gross revenue, and fees paid; and

[0189] Costs.

[0190] These data elements are part of the airline's database. Passengers referred to here are members of the frequent flyer programs.

[0191] Recency Group and Flight Frequency

[0192] The recency group includes passengers who have flown the airline within the airline-specified recent time period, for example, in the past six or twelve months. These are active passengers for the time period under consideration, and the flight activities of the passengers, who are members of the frequent flyer program, are summarized for that time period.

[0193] Flight activities are defined as any revenue generating flights actually flown during a specified period of time. Each flight activity is measured as a one-way, end-to-end trip. For example, a flight from National Airport in Washington, D.C. to New York's JFK International Airport is counted as one flight activity. A flight from Newark Airport to Los Angeles via Cleveland is counted as one flight activity, even though the passengers need to deplane at the Cleveland airport and board another flight to Los Angeles. Likewise, if the flight from Newark to Los Angeles is a non-stop flight, then this flight is also counted as one flight activity. The flight activity information is available from the passenger ticket reservation system data as well as from the flight data of the airline. The counts start from the origination airport and end with the destination airport. All major airports have a unique code.

[0194] A summary of all the flight activities within the specified time period for each passenger is the frequency value of the passenger. Reward flights may be included in the database; if so, they are counted toward the frequency value, and the net revenue calculation takes them into account.

[0195] A summary of the mileage flown in the recency period is a straightforward calculation, obtainable directly from flight activity data.
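As a minimal sketch of the counting rule above, the frequency value and mileage for each passenger may be summarized from leg-level records as follows. The tuple layout and the `trip_id` field, which is assumed to link the legs of one end-to-end trip, are hypothetical, and the mileage figures are illustrative only.

```python
from collections import defaultdict


def summarize_activity(legs):
    """Return {passenger_id: (frequency_value, miles_flown)}.

    legs is an iterable of (passenger_id, trip_id, origin, destination, miles).
    Legs sharing a trip_id belong to one one-way, end-to-end trip, so a
    connecting itinerary counts as a single flight activity.
    """
    trips = defaultdict(set)
    miles = defaultdict(int)
    for passenger_id, trip_id, _origin, _destination, leg_miles in legs:
        trips[passenger_id].add(trip_id)
        miles[passenger_id] += leg_miles
    return {pid: (len(trips[pid]), miles[pid]) for pid in trips}


legs = [
    ("P1", "T1", "EWR", "CLE", 404),   # Newark to Los Angeles via Cleveland:
    ("P1", "T1", "CLE", "LAX", 2052),  # two legs, one flight activity
    ("P1", "T2", "DCA", "JFK", 213),   # a non-stop trip: one flight activity
]
print(summarize_activity(legs))
```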

[0196] Revenue Contribution

[0197] The next step is to calculate the passenger's revenue contribution to the airline. The gross revenue contribution is a summary of the revenue per passenger per flight activity and is the ticket price the passenger paid for each leg of the trip or the entire trip.

[0198] Cost Factors

[0199] The costs associated with that flight activity should be subtracted from the gross revenue contribution. The following are cost factors:

[0200] Domestic/International ticket sales costs: sales channels can be divided into several categories, such as sales through the airline's Computerized Reservation System (CRS), paperless e-tickets, or paperless web tickets. The costs/fees of each sales channel may be different. These costs should be subtracted from the gross revenue contribution.

[0201] Travel agent commissions: in addition to the above sales costs, if the ticket was issued by a travel agent, a certain percentage of the ticket price should be deducted for commission. If the ticket was issued by another airline, a certain percentage of fees also needs to be deducted.

[0202] Airport Fees: airport landing fees are a significant portion of the airline's costs. These fees need to be deducted from the gross revenue.

[0203] Meals/Beverages: costs of meals and beverages should be subtracted from the gross revenue contribution. However, if the flight activity is a reward trip, then these costs need not be subtracted, since the costs are embedded in the cost of miles.

[0204] Taxes: certain taxes paid by the airlines should be deducted.

[0205] Those costs are usually either shown on the sales of tickets or calculated through carrier-specific formulas or percentages. As discussed above, the operational costs are not considered here.

[0206] If the reward flights are included, then the cost of frequent flyer miles needs to be deducted from the gross revenue. Each airline may have its own formula to calculate the cost of rewarded miles. Certain specific rates may be associated with the specific reward redeemed. For a frequent flyer program, accumulation of miles is not a cost, but redemption of the frequent flyer miles is a cost. For a free upgrade, the lost revenue may be calculated using an airline-specific formula.

[0207] Net Revenue Contribution

[0208] Once the revenue and all costs are calculated, the difference between the gross revenue contribution and the overall costs is the net revenue contribution. This is a dollar value measurement of each passenger's contribution to the airline's bottom line (i.e., the airline's profit margin).

[0209] All flight activities, frequency value and net revenue contribution data are summarized at a passenger level for the recency period. That is, each member of the frequent flyer program should have a unique account followed by other fields that contain all other information.
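The net revenue calculation may be illustrated with the following sketch. The record layout and field names are assumptions, as are the dollar amounts; in practice each cost would come from ticket data or from a carrier-specific formula, as noted above.

```python
from dataclasses import dataclass


@dataclass
class FlightActivity:
    ticket_price: float              # gross revenue for this flight activity
    sales_channel_cost: float = 0.0
    agent_commission: float = 0.0
    airport_fees: float = 0.0
    meals: float = 0.0
    taxes: float = 0.0
    is_reward: bool = False
    reward_miles_cost: float = 0.0   # cost of redeemed miles (airline-specific)


def net_revenue(activity):
    """Gross revenue minus the cost factors; operational cost is not considered."""
    costs = (activity.sales_channel_cost + activity.agent_commission
             + activity.airport_fees + activity.taxes)
    if activity.is_reward:
        # Meal costs are treated as embedded in the cost of redeemed miles.
        costs += activity.reward_miles_cost
    else:
        costs += activity.meals
    return activity.ticket_price - costs


def net_revenue_contribution(activities):
    """Sum the net revenue over one passenger's flight activities."""
    return sum(net_revenue(a) for a in activities)


paid = FlightActivity(500.0, sales_channel_cost=10.0, agent_commission=50.0,
                      airport_fees=20.0, meals=8.0, taxes=12.0)
reward = FlightActivity(0.0, airport_fees=20.0, meals=8.0,
                        is_reward=True, reward_miles_cost=60.0)
print(net_revenue_contribution([paid, reward]))
```

Under these assumed figures the reward trip contributes a negative amount, which is how a reward flight counted toward frequency reduces the passenger's net revenue contribution.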

[0210] Scoring Method

[0211] Scoring for the CVMM uses frequency value and net revenue contribution value in a common procedure to rank and score the customer values. After obtaining frequency values (FV) and net revenue contribution values (CV) for the passengers of the recency group, the two values are scored. The purpose of scoring is to identify the group of passengers who are high frequency flyers and high net revenue contributors. There are several possible ways to divide and score the data; a preferred approach is to divide the entire data into four subgroups. A similar method can be used to divide the entire data into deciles or any number of subgroups.

[0212] Frequency Value Scoring

[0213] Sort the FV in descending order;

[0214] Determine the 75%, 50% and 25% break points, i.e., divide the entire population into four quartiles. Each break point corresponds to a frequency value; for example, at the 75% break point the FV is 25, at the 50% break point the FV is 12, etc.;

[0215] Move the break points when there are ties. For example, if the 75% observation is the 3,000th record and its FV is 25, but the 3,001st record has the same FV, then go down the list until the FV changes its value. That observation is the break point. Apply the same method to the entire data to determine the break points. The entire population may not be evenly divided when there are ties at the break points; and

[0216] Assign integer values to each of the sub-groups. For example, assign 4 to the records above the 75% break point, 3 to the records between the 75% and the 50% break points, 2 to the records between the 50% and the 25% break points, and 1 to the records below the 25% break point. These are Frequency Scores (FS).

[0217] Net Revenue Contribution Scoring

[0218] Apply a similar method to determine the 75%, 50% and 25% break points for Net Revenue Contribution Value (CV), then the same integer values (4, 3, 2, 1) will be assigned to each quartile. Those 4 integers are the scores of the FV and CV series. These are Contribution Scores (CS).
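The quartile scoring with tie handling described above may be sketched as follows. This is one possible reading of the procedure, in which records tied with the record at a break point stay in the higher-scoring group; the function name is hypothetical. The same function can be applied to the FV series to obtain FS and to the CV series to obtain CS.

```python
def quartile_scores(values):
    """Return a list of scores (4 = top quartile ... 1 = bottom), aligned with values."""
    n = len(values)
    # Indices of the records, sorted by value in descending order.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    scores = [1] * n
    start = 0
    for k, score in zip((1, 2, 3), (4, 3, 2)):
        cut = max(start, (n * k) // 4)  # nominal break position
        # Move the break point down the list while the value is tied.
        while 0 < cut < n and values[order[cut]] == values[order[cut - 1]]:
            cut += 1
        for i in order[start:cut]:
            scores[i] = score
        start = cut
    return scores


fv = [25, 25, 25, 12, 12, 8, 5, 1]
print(quartile_scores(fv))
```

In this small example the third record is tied with the record at the first break point, so three records rather than two receive a score of 4, and the groups are not evenly sized.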

[0219] CVMM Scoring

[0220] After scoring for both CV and FV, sort the entire data by the scores-pair series (CS, FS) in descending order.

[0221] The possible pairs are (4, 4), (4, 3), (4, 2), (4, 1), . . . (1, 4), (1, 3), (1, 2), and (1, 1).

[0222] For the records with the same pair, sort by CV. For example, if two records both have (3, 2), but one's CV is $2,000 and the other's CV is $1,850, then the one with a CV of $2,000 is above the one with a CV of $1,850 in sorting.

[0223] If the records still have the same CV, then sort by FV. For example, for the records having the same scores-pair (3, 2), if they both have the same CV of $1,680, then sort them by their FV. The one with a higher FV will then be ranked higher than the one with a lower FV.

[0224] When all the records have been sorted by their (CS, FS) score-pair, divide the entire population into 100 subgroups. Give each record within a subgroup a numerical value from 100 to 1. Those records with the highest 1% of scores are assigned a value of 100; the next 1% are assigned a value of 99. This process continues until the lowest 1% is assigned a value of 1. These assigned numerical values are called Customer Value Metric Scores (CVMS).

[0225] The passengers with high CVMS are the High Valued Customers.
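The sorting and percentile assignment may be sketched as follows, assuming each record already carries its CV, FV, CS and FS. The dictionary layout and the rule mapping a rank to one of 100 subgroups are assumptions made for illustration. With only a handful of records the scores are spread thinly from 100 downward, so this sketch shows the ordering rather than reproducing scores computed on a full customer population.

```python
def cvm_scores(records):
    """Return [(record, CVMS)] ranked from highest to lowest customer value.

    Records are sorted by CS, then FS, then CV, then FV, all descending,
    and the ranked list is divided into 100 equal-width subgroups.
    """
    ranked = sorted(records,
                    key=lambda r: (r["CS"], r["FS"], r["CV"], r["FV"]),
                    reverse=True)
    n = len(ranked)
    # Rank 0 maps to 100; the lowest-ranked records map toward 1.
    return [(r, 100 - (i * 100) // n) for i, r in enumerate(ranked)]


records = [
    {"id": "A", "CV": 2000, "FV": 9, "CS": 3, "FS": 2},
    {"id": "B", "CV": 1850, "FV": 9, "CS": 3, "FS": 2},
    {"id": "C", "CV": 1680, "FV": 7, "CS": 3, "FS": 2},
    {"id": "D", "CV": 1680, "FV": 8, "CS": 3, "FS": 2},
    {"id": "E", "CV": 5000, "FV": 1, "CS": 4, "FS": 1},
]
for record, score in cvm_scores(records):
    print(record["id"], score)
```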

[0226] Table 2 is a result of applying the above process to actual airline data.

TABLE 2

Customer     CV    FV    CS    FS    CVMS
    1       850     2     4     1     82
    2       682     1     4     1     82
    3       450     4     3     3     72
    4       503     3     3     3     74
    5       122     6     1     4     61
    6       159     5     1     4     60
    7      1263     5     4     4     91
    8      2202     6     4     4     94
    9       510     5     3     4     79
   10       180     4     1     3     53

[0227] First, this method handles tier 2 customers subjectively. According to the above table, a (CS, FS)=(4, 1) pair always has a higher score than any (1, 4) pair. That is, low frequency flyers with a higher net revenue contribution are always ranked higher in customer value than those with higher flying frequency but lower revenue contributions. In the above table, customer 1's CVM score (82) is much higher than customer 6's (60) only because customer 1 has a higher CV, even though customer 1's FV is much lower (2 vs. 5). This shows that the ranking of a customer's value is determined by the sorting procedure. It may be biased when ranking the customers in the second and fourth quadrants.

[0228] Second, because tied pairs are sorted first by CV and then by FV, this ranking procedure may cause a biased ranking. Looking at customers 3 and 4, for example, since they are in the same group (3, 3), they are first sorted by CV, and then by FV. After sorting, customer 4 obtains a higher score (74) than customer 3 (72), even though customer 3 flies more frequently than customer 4. It is not a big problem in this case because the difference between their CVs is relatively small. However, depending on the data size and scoring sensitivity, for a very large database, a small scoring difference may affect many customers' values. In summary, since this method considers the combination of CV and FV, it is a challenge to balance the weight or rank order of the two values.

[0229] An alternative method for alleviating bias is to calculate either (a) the ratio CV/FV or (b) the product CV*FV. CV/FV yields a CV per FV, but the problem with this method is that a high CV with a low FV, such as when FV equals 1, is ranked higher, e.g., customer 2 in Table 2 above. CV*FV is actually a CV weighted by FV, or an index of customer value; however, CV*FV may change the entire ranking from the above procedure. For example, when the multiplication is applied to the above table, customer 8 is ranked highest, customer 7 second, and customer 9 third. The (4, 1) pair (i.e., customer 2) is now ranked lowest. This method seems to give a relatively balanced ranking of customers' values.

[0230] Alternative Methods for CVM Scoring

[0231] Alternative methods to calculate the CVM scores are now described. The methods consider CV as the primary measurement for customer value, and FV as the desired complementary factor.

[0232] Procedure One

[0233] The first procedure is as follows:

[0234] 1. Calculate the product CV*FV;

[0235] 2. Sort based on the calculated value;

[0236] 3. Segment the entire CV*FV series into 100 subgroups;

[0237] 4. Assign values to each subgroup (100 to the highest 1%, 99 to the next 1%, . . . , 1 to the lowest 1%) as stated before; and

[0238] 5. The assigned values are the CVMSs.

[0239] According to this method, Table 2 above will change to Table 3 below:

TABLE 3

Customer     CV    FV    CV * FV    CVM Score
    8      2202     6     13212        94
    7      1263     5      6315        91
    9       510     5      2550        79
    3       450     4      1800        72
    1       850     2      1700        82
    4       503     3      1509        74
    6       159     5       795        60
    5       122     6       732        61
   10       180     4       720        53
    2       682     1       682        82

[0240] Table 3 results indicate that even though the CV*FV scores give a ranking mostly consistent with the previous CVMS method, customer 2, with a significantly lower FV, is now ranked at the bottom. Another observation is that although customer 3's CV is lower than customer 4's, customer 3 is ranked higher because of the FV score. This shows that the difference in CV between the two customers does not offset the difference in their frequency of flying.
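Procedure One can be illustrated with a minimal Python sketch, assuming the CV and FV values of Table 3 as input. The function name cvm_scores_by_product and the integer-division rule used to form the subgroups are illustrative assumptions and are not part of the described method; with only ten customers the 100 subgroups are necessarily sparse, so the scores step by ten.

```python
# Rows taken from Table 3 as (customer, CV, FV).
TABLE_3 = [
    (8, 2202, 6), (7, 1263, 5), (9, 510, 5), (3, 450, 4), (1, 850, 2),
    (4, 503, 3), (6, 159, 5), (5, 122, 6), (10, 180, 4), (2, 682, 1),
]


def cvm_scores_by_product(customers):
    """Rank customers by CV*FV and map each position to a 1-100 score."""
    # Steps 1 and 2: compute the product and sort, highest product first.
    ranked = sorted(customers, key=lambda c: c[1] * c[2], reverse=True)
    n = len(ranked)
    # Steps 3 to 5: position 0 is the best customer. Integer division puts
    # the top 1% in subgroup 100 and the bottom 1% in subgroup 1 once n >= 100.
    # (Assumed bucketing rule; the source only names the 100 subgroups.)
    return [
        (cid, cv * fv, 100 - (pos * 100) // n)
        for pos, (cid, cv, fv) in enumerate(ranked)
    ]


scores = cvm_scores_by_product(TABLE_3)
ranking = [cid for cid, _, _ in scores]
```

Here ranking lists the customers from highest to lowest CV*FV, so customer 2 (FV of 1) falls to the last position, consistent with the observation above.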

[0241] Procedure Two

[0242] Procedure two further considers a more appropriate weight using frequency values.

[0243] 1. Sort based on CV; if there are ties of CV, then sort by FV, in descending order;

[0244] 2. Determine the 75%, 50% and 25% break points and assign a value, e.g., integers 1-4, to each quartile;

[0245] 3. Calculate the average FV for each quartile;

[0246] 4. If the mean of FV for each quartile is significantly different (using a statistical procedure such as a t-test), then calculate the ratio of each FV to its quartile mean; and

[0247] 5. Use these ratios as weights to calculate CV*(FV weight). This value is called CVFW and is the CVM score.

[0248] This procedure gives a weighted index of CV. Each CV is weighted by its FV weight. FVs higher than the group mean have a weight ratio greater than 1 and FVs lower than the group mean have a weight ratio less than 1. This procedure gives better-balanced scores to high CV, high FV customers.
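The weighting in Procedure Two can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the eight customers and their values are hypothetical, the function name cvfw_scores is illustrative, and the significance check of step 4 (e.g., a t-test on the quartile means) is omitted, so the weights are always applied.

```python
from statistics import mean


def cvfw_scores(customers):
    """customers: list of (customer_id, cv, fv). Returns {customer_id: CVFW}."""
    # Step 1: sort by CV, breaking ties by FV, in descending order.
    ranked = sorted(customers, key=lambda c: (c[1], c[2]), reverse=True)
    n = len(ranked)

    # Step 2: assign quartile labels 4 (top) down to 1 (bottom).
    quartiles = {}
    for rank, cust in enumerate(ranked):
        label = 4 - (rank * 4) // n
        quartiles.setdefault(label, []).append(cust)

    scores = {}
    for members in quartiles.values():
        # Step 3: average FV within the quartile.
        mean_fv = mean(fv for _, _, fv in members)
        for cid, cv, fv in members:
            # Steps 4 and 5: weight = FV / quartile mean; CVFW = CV * weight.
            # (The significance test of step 4 is skipped in this sketch.)
            scores[cid] = cv * fv / mean_fv
    return scores


# Hypothetical customers: (id, CV, FV).
sample = [
    ("A", 2000, 6), ("B", 1800, 2), ("C", 900, 5), ("D", 850, 5),
    ("E", 500, 1), ("F", 450, 3), ("G", 200, 4), ("H", 150, 4),
]
cvfw = cvfw_scores(sample)
```

In the top quartile of this sample the mean FV is 4, so customer A (FV of 6) receives a weight of 1.5 and customer B (FV of 2) a weight of 0.5, which widens the gap between two customers whose CVs are close.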

[0249] Procedure Three

[0250] The third procedure uses mileage value (MV) instead of FV to weight the CV. Procedures similar to those discussed above are followed to calculate a mileage-weighted CV.

[0251] 1. Sort by CV. If there are ties of CV, then sort by MV, in descending order;

[0252] 2. Determine the 75%, 50% and 25% break points and assign values 4-1 to each quartile;

[0253] 3. Calculate the average mileage for each quartile;

[0254] 4. If the mean of mileage for each quartile is significantly different, then calculate the ratio of each customer's mileage to its quartile mean; and

[0255] 5. Use these ratios as weights to calculate CV*(mileage weight) and call it CVMW.

[0256] Using flight mileage is more appropriate for several reasons. First, airlines always consider flight mileage as an important indication of customer value. This is exactly why airlines have frequent flyer programs in which each member earns points based on the miles flown (not the frequency value). Second, flight mileage is a more accurate measure of flight activities. In our example, a flight from Newark Airport to Los Angeles can go via Cleveland or can be a non-stop direct flight. In either case, FV will be counted as 1, but CV will be different and so will flight mileage. A non-stop flight from Newark to Los Angeles may be more expensive but cover less flight mileage than the non-direct flight. The value offered by the non-stop flight is time savings, as discussed above. A customer flying a non-stop flight and paying a higher fare is a highly time-sensitive customer, e.g., usually a business traveler. The frequency values do not reflect this difference in customers. A combined measure of CV and mileage captures the nature of flying activities and thus distinguishes the high valued customers from the rest.

[0257] The process to obtain and prepare the data from which the model is developed is now described.

[0258] Data Elements: Describes the data elements necessary to execute data analysis and then build analytical models. This section defines customer, sales channel, and travel agent data, as well as airline operational data that is critical for a successful analytical model. Data element tables are shown in Tables 4-6 below, and a notation key is provided in Table 7.

[0259] Table 4 lists those data elements that are important customer and operational data. These are elements that are needed for the models described in this document. The table indicates probable source, importance of the element, and how the element appears in the logical data model for airlines.

TABLE 4 Customer and Operations Data Elements

Data Source | Data Element | Importance to Model | Mapping to LDM
IC | Customer ID (Frequent Flyer ID) | V | Yes
IC | Customer Address, Phone Number, Zip Code | HD | Yes
IC | Contacting Records | V | Yes
IC/IO | Flight Data: O & D, Time, Legs, Route, Actual Mileage | V | Yes
IC | History of the Customer | V | Yes
IC | Service Class | D | Yes
IC | Member Status | HD | Yes
IC | Booking | HD | Yes
IC/IO | Travel Agency Code/Location/Type | HD | Yes
IC | Cumulated Mileage/Points | V | Yes
IC/IO | Gross Revenue Contribution | V | Yes
IC/IO | Tickets | V | Yes
IC/IO | Checking-in | HD | Yes
IC/IO | Customer Canceled Flights | HD | Yes
IC/IO | Coupon Revenue | V | Yes
IO | Costs (or formulas, percentages to allocate certain costs items) | V | NO #
IC/IO | Baggage: Missing/Mishandled | HD | NO ##
IO | Flight Incident: Delay, Canceled, Changed Route, | V | Yes
IC | Customer Complaints | V | Yes
IC/IO | Reward Flights | HD | Yes
EC | Customer Profile (1): Occupation, Employer/Employment History | V | Yes*
EC | Customer Profile (2): Annual Income, Credit Ranking/History, | V | Yes*
EC | Customer Profile (3): Education, General Household Data | HD | Yes*
EC | Customer Profile (4): Travel Related-rental car, hotel, credit card | HD | Yes*
EC | Customer Profile (5): Lifestyle | HD | Yes*
EC | For Business Owner: Business Type, Annual Revenue, | HD | Yes
EC | Home Business Indicator | HD | NO**
EC | Business Credit Rating | HD | Yes
EC | State | 0 | Yes
EC | Zip code/Postal code | 0 | Yes
EC | Metropolitan Statistic Area or Geographical Specific Data | 0 | Yes
EC | Area Population | 0 | Yes
EO | Market Share | V | Yes
EO | Published Scheduling (all airlines for the selected hubs) | V | Yes
EO | Actual Scheduling (all airlines for the selected hubs) | V |
EO | Hub Capacity (number of flights, Enplanement) | HD | NO
EO | Competing Hubs Operated by Other Airlines | HD | Yes
EO | Airline Service Quality Performance | HD | Yes

[0260] Table 5 lists additional data that may be useful for a retention model. This data may help provide insight into customer satisfaction issues, but is not directly used in the models described in the document.

TABLE 5 Other Desirable Data

Data Source | Data Element | Importance to Model | Mapping to LDM
I | Price Modifications/Discounting Policy | HD | Yes
I | Customer Complaints Processing Procedures and Standard | HD | Yes
I | Customer services standard - quality, response time, etc. | HD | Yes
I | Customer satisfaction measurement | HD | Yes
I | Any measures encouraging customer loyalty | HD | Yes

[0261] Table 6 lists data elements that are output from the model. These are usually the scores attached to a customer record as a result of the analysis performed by the model. These scores allow the airline to rank customers based on their contribution, their likelihood to defect, etc. The scores are usually used to help target a population for a marketing campaign or for special treatment by the airline.

TABLE 6 Model Output Data

Data Source | Data Element | Explanation | Mapping to LDM
MO | Customer ID | Unique primary key for customer | Yes
MO | Customer Value Score | Score for customer value (revenue) | Yes
MO | Retention Score | Score indicating probability of retention | Yes
MO | Multiplication of Customer Value Scores and Retention Scores | Derived variable | Yes
MO | Target Indicator: At risk customer indicator | Indicates customer is target of campaign | Yes
MO | Decile | Decile of population in which customer is classified | Yes

[0262] In the table below, the following codes are used to indicate importance of the data element and probable sources for the data element.

TABLE 7 Importance and Data Source Notation

Importance to the model
V | vital
HD | highly desirable
D | desirable
N | of questionable or no value

Data Sources
I | internal | From customer's database
IC | internal about customers | Information about customers (name, address, FF number, miles flown, etc.)
IO | internal about operations | Information about operations (flight, schedules, routes, costs, etc.)
E | external | Information from the public sector or from private vendors of data
EC | external about customers | Information about customers (name, address, mortgage, income, number of cars, etc.)
EO | external about operations | Information about operations (markets, total flights in a given market, total dollars spent on tickets, etc.)
MO | Model Output | Output from the analytical model

[0263] Internal Data Sources: Describes the internal operational data sources, including customer-base data, revenue management, flight scheduling, sales channel, and travel agency data.

[0264] External Data Sources: Includes business and other socioeconomic data provided by private vendors, and public data sources.

[0265] Data Extraction: Provides descriptions of the following data extraction tasks:

[0266] Map the data;

[0267] Extract data from all data sources;

[0268] Clean and condition the data; and

[0269] Create the analytical data file.

[0270] Data Elements

[0271] To successfully execute data analysis and build analytical models, one must know the data structure and the data elements. A modeler involved in an airline industry engagement should be aware of the data areas described below. Only the data elements for building analytical models are described next; the entire data warehouse is not described.

[0272] The airline provides its operational data, both current and historic. In addition, certain external data is acquired, as the client desires. The data areas and critical data elements, as shown in Tables 4-6, are described next.

[0273] If a data warehouse (DW) exists for the carrier, then the analytical modelers rely on the DW to obtain the data elements (at least from internal data sources). Otherwise, the modelers obtain the data directly from the carrier's data sources. The modelers may have to rely on the carrier's database management system (DBMS) to provide the needed data, which adds cost and extends project time.

[0274] It is important for the analysts and project managers to know that, since the airlines' internal data sources may reside in different legacy systems and be managed by different departments, the data may not exist in a usable form and the data integrity may be poor. The matching rate for external data is sometimes low. The poorer the condition of the data, the more costly and time consuming the project is.

[0275] Another point worth mentioning is that all internal data sources are secured, and it may be extremely difficult to access the data if there is no DW. Therefore, a virtual DW or staging area architecture may be necessary.

[0276] Basic Data Structure

[0277] The basic data structure for analytical modeling is described below with reference to FIG. 3. The data structure described here is for analytical modeling only and does not cover the entire data warehouse, nor is it a substitute for the logical data model (LDM). The LDM for Customer Relationship Management is described fully in the co-pending application entitled “Logical Data Model for Airline Customer Relationship Management”, which is hereby incorporated by reference in its entirety.

[0278] Customers FF 302: the oval area in the center of the figure represents the customers who are members of the airline's Frequent Flyer Program. They are the target population for the retention models.

[0279] The CVM model 304 ranks these customers and identifies the high valued customers.

[0280] Customer Care 306 provides information about the customers' experience with the airline and its services.

[0281] Booking/Reservation 308. The customers start with the booking and reservation system when they purchase their tickets. They become revenue-generating customers only when they actually board the airplane (check in).

[0282] Flights 310 are the product airlines offer to the customers and are the source of revenue. The flight data provides the customer's revenue contribution, mileage, and frequency, as well as destination, route, and other information.

[0283] Flight incidents and service factors 312 determine whether the customers are satisfied with the products and services supplied by the airline. These experiences influence a customer's selection of carriers.

[0284] One-way arrow lines in FIG. 3 indicate a one-way flow of information, while two-way arrows indicate a two-way flow of information. For example, flight data 310 provides information to reward flights 314 (i.e., a one-way flow of information). On the other hand, customer data or customer FF 302 provides input to CVM model 304, but CVM model 304 feeds the ranking results back to the customer data (i.e., a two-way flow of information).

[0285] The data elements are now described in more detail.

[0286] Customer FF 302: The purpose of retention models is to help the airlines retain their most highly valued customers. Customer FF 302 means customers of frequent flyer programs. These customers are the target population of the airlines' retention efforts. All other data elements must be able to link back to this data element, directly or indirectly. This data element provides information about who the customers are and where they are, and includes the following additional data elements:

[0287] Customer Base: basic information about a customer (CustomerID, name, address, etc.);

[0288] Contacting: how the customer was contacted;

[0289] Reward: the customer's history of earning reward points and bonuses;

[0290] Profile: what the customer looks like (occupation, education, other socioeconomic elements);

[0291] Segmentation: customer segmentation, i.e., how customers behave according to certain criteria;

[0292] Customer Life Cycle: the history of the customer and events in this duration; and

[0293] Customer Status: whether the customer is active or inactive.

[0294] CVM Model 304: As discussed, the Customer Value Metric Model ranks the customers based on their Net Revenue Contribution, mileage and frequency values. This model identifies the sub-group of high valued customers. CVM Model data includes:

[0295] CustomerID;

[0296] Recency Period: the time span used to determine the customer value;

[0297] Customer value measures: revenue contribution, frequency, and mileage flown; and

[0298] Ranking scores.

[0299] Customer Care 306: Unsatisfied customers are very likely to change their air travel carriers whenever they are able to do so. Customer care data provides information on the relationship between a carrier and its customers. The customer care data elements about the airline's response to customers influence the satisfaction level of customers and consequently influence their decision to select the airline. Customer care includes the following data elements:

[0300] Customer Care Base: Information about customer contacts, calls received, complaints and compliments, airline response, etc.;

[0301] Flight Incidents: One major input to customer care is a flight incident, including flight cancellation, delay, missed flight, changing of route, changing of flights or carrier, etc.;

[0302] Service Factor: Another important input to customer care is service quality, including increases in fare, changes in frequent flyer programs, airport services, connection services, baggage services, etc.; and

[0303] Sales and Travel Agencies: The reservation and booking process affects a customer's experience of air travel.

[0304] Booking and Reservation System (CRS) 308: Customers start their traveling experience with ticket booking and reservation. Through different sales channels, mostly through travel agencies, customers reserve and then purchase their tickets. The booking and reservation system 308 includes the following data elements:

[0305] Booking: who made the reservation;

[0306] Ticketing: who actually purchased the tickets;

[0307] Sales channels and travel agency: the media through which customers reserved and purchased the tickets; and

[0308] Base fare and discount: base fare is the full price (or expected revenue) set by the airline; discount shows how much the airline discounted any particular ticket.

[0309] Ticket 316: Ticket 316 includes tickets actually issued. Ticketdata includes:

[0310] Ticket number;

[0311] Issuing Date;

[0312] Carrier ID;

[0313] Agency ID;

[0314] Issuing city code;

[0315] Customer identification number (which may differ from the Customer FF ID);

[0316] Customer name;

[0317] Customer address;

[0318] Flight number;

[0319] Departing/destination airports;

[0320] Scheduled departing/arrival time;

[0321] Route;

[0322] Fare amount;

[0323] Airport fee;

[0324] Taxes; and

[0325] Transferring code indicating whether the passenger was transferred from another airline, or within the same airline but to a different flight.

[0326] Check-in 318: When the customer actually boards the airplane, the ticket sold becomes the airline's actual revenue. The check-in data confirms who actually flew.

[0327] Flight 310: Flight data is probably the most comprehensive and complete data the airlines have. Each flight represents a one-way, one take-off-to-landing segment. This data includes:

[0328] Flight number;

[0329] Departing airport;

[0330] Destination airport;

[0331] Scheduled departure/arrival time;

[0332] Route;

[0333] Legs;

[0334] Distance flown;

[0335] Equipment;

[0336] Crew;

[0337] Service classes;

[0338] Actual departure/arrival time; and

[0339] Enplanement: number of passengers boarded on the flight.

[0340] Actual (Coupon) Revenue 320: Each passenger on a flight (except the passengers on a reward flight) generates revenue for the airline. The trip also adds mileage flown to the frequent flyers' earned points. The data elements include:

[0341] Ticket number;

[0342] Ticket issued date;

[0343] Flight number;

[0344] Actual Revenue (or Coupon Revenue);

[0345] Coupon originating/destination airports;

[0346] Mileage;

[0347] Flight leg for each coupon;

[0348] Cabin codes;

[0349] Base fare;

[0350] Discounting coding; and

[0351] Agency coding.

[0352] Reward Flight 314: When frequent flyers accumulate enough points from their trips, the airline agrees to redeem these points by offering them a free trip to selected destinations or an upgrade in passenger service class. A passenger flying on a reward flight generates no revenue but does incur costs to the airline. These reward flights need to be separated and identified. Furthermore, how an airline rewards its frequent flyers, and how a passenger uses the reward program, may have significant influence on loyalty/defection behavior.

[0353] Market Share 322: Market share is very important information for retention models. Since the definition of defection depends on the competitiveness of the market, the market share data provides a measure for every O&D market the airline serves. The market share is measured as a percentage of the following: frequency of flights, equipment used, number of stops and connections, and passenger volume.
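A percentage share per O&D market can be computed along the following lines. This is a minimal Python sketch under stated assumptions: the record layout, the carrier codes X1 to X3, and the figures are hypothetical, and the input is assumed to hold one record per carrier per market. Only two of the listed measures (frequency of flights and passenger volume) are shown.

```python
from collections import defaultdict


def market_share(records, metric):
    """Return {(market, carrier): percent share of `metric`} per O&D market."""
    # Total the chosen metric over all carriers serving each market.
    totals = defaultdict(float)
    for rec in records:
        totals[rec["market"]] += rec[metric]
    # Express each carrier's figure as a percentage of its market total.
    return {
        (rec["market"], rec["carrier"]): 100.0 * rec[metric] / totals[rec["market"]]
        for rec in records
        if totals[rec["market"]] > 0
    }


# Hypothetical monthly figures for one O&D market served by three carriers.
records = [
    {"market": "EWR-LAX", "carrier": "X1", "flights": 120, "passengers": 15000},
    {"market": "EWR-LAX", "carrier": "X2", "flights": 60, "passengers": 9000},
    {"market": "EWR-LAX", "carrier": "X3", "flights": 20, "passengers": 6000},
]
flight_share = market_share(records, "flights")
passenger_share = market_share(records, "passengers")
```

Keeping the metric as a parameter lets the same routine produce a share for each measure, so that a market can be judged competitive on several dimensions rather than one.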

[0354] Internal Data Sources

[0355] There are three flows in an airline: passenger, equipment and crew. The airline's operation and planning processes focus on these three flows. For retention models, the equipment and crew flows are less important. At the center of retention is the passenger flow. Airlines typically possess the following operational data sources:

[0356] Customer data: Airlines may have customer data through certain channels or contact with customers. This data covers both frequent flyers and non-frequent flyers. The data is highly valuable for retention if the records are linked back to flight and revenue databases.

[0357] Frequent Flyer Program Data: Airlines usually have good records on the members of the frequent flyer program, particularly the elite club members.

[0358] Booking/Reservation data: Airlines have a massive reservation system called the Computerized Reservation System (CRS) 308. The booking process is conducted using this system; however, the records are usually short-lived (i.e., they are purged periodically). In order to keep all these records for at least the modeling period, a data warehouse, or a facility to store the historical data, is necessary.

[0359] Travel Agency Data: This data includes agency codes, location and business types, contract type, share of sales, and loyalty of agency.

[0360] Flight data: As noted above, this is the most comprehensive and complete database airlines possess. Almost all operational data is contained here or derived from here. This database covers flights, scheduling, route, airports, and other information.

[0361] Revenue Management: Revenue management is a key part of airline operations. Airlines use the revenue management models to forecast demand and expected revenue. The base fare, coupon revenue, and mileage-seat capacity are found here.

[0362] Ticketing data: Ticketing data is the output of the booking process; however, this data, like that in the CRS, needs to be stored in a data warehouse for modeling use.

[0363] TCN (Ticket Control Number): This data contains all information recorded when a ticket was issued (i.e., purchased by a customer); and

[0364] PRA (Passenger Revenue Accounting): This data contains ticketing data but only when the ticket was collected, which means that the passenger actually boarded the airplane.

[0365] Airline data sources are usually fragmented and stored in different legacy systems. While reservation and flight operational data are on mainframe computers, marketing data may be on different systems, such as Informix or other database systems. All major airlines operate in a so-called line and staff organizational structure. The line organization includes all departments and personnel directly involved with the airline's services: operations, maintenance, and sales and marketing. The staff organization includes special departments and personnel such as law, accounting and finance, employee relations, and public relations. The airline data sources are created, maintained and used by these different departments, and the modeler needs to know the data sources unless there is a data warehouse in place. All operational data needs to be summarized.

[0366] External Data Sources

[0367] More data is always desirable, and external data, including business and other socio-economic information, helps interpret data and enhances the predictability and accuracy of the models. However, external data is not cheap to secure, so the marginal benefit of including external data in model building is a delicate issue. Including external data depends on the following considerations:

[0368] Airline's objective for the modeling project;

[0369] Availability and extent of the external data coverage;

[0370] Cost of the external data;

[0371] Analysts' experience in using the external data; and

[0372] Measurements of the modeling results improvement.

[0373] The decision to obtain external data is based upon a cost and benefit analysis. Experience indicates that external data contributes to the analytical models and some external data elements prove to be significant predictive variables in the models. In addition, these data elements provide customer classification information. Furthermore, as described previously, a customer's travel pattern may be affected by a job change or other factors. Therefore, external data, including information on such issues, may be vital to derive the response variable for the retention models.

[0374] Available data sources, public or private, external to the carrier are discussed next.

[0375] Public Data Sources

[0376] Unlike other industries, the airline industry has vast data sources that are available in the public domain. Although the data may need to be purchased, use of it is generally not restricted and in some cases, the data may be available from third party vendors who have cleaned it up to make it easier to incorporate in a data warehouse. The following is a list of sources for airline related data. Some of the data sources listed here are for U.S. airlines only. An engagement with an international airline may require more data discovery at the beginning of the project.

[0377] Department of Transportation (DOT)

[0378] http://www.dot.gov

[0379] http://www.bts.gov

[0380] The Department of Transportation (DOT) and the Bureau of Transportation Statistics (BTS) possess vast amounts of airline data. Some of the data is listed below:

[0381] Forms 41 and 198C: Quarterly information provided by each carrier that includes revenue, cost, employee count, and traffic (RPM, ASM, fuel usage) by equipment and by airport.

[0382] T3: Monthly airport statistics (operation, enplanement) by equipment and carrier.

[0383] T100: Monthly segment statistics (available seats, enplanement, distance, block time, schedule time) by equipment and carrier.

[0384] O & D Survey: Quarterly information based on a 10% ticket sample on each city pair served.

[0385] Airline Service Quality Performance (ASQP): Actual flight time records vs. published schedule for each flight.

[0386] Customer Complaint: Summarized by airline.

[0387] Federal Aviation Administration (FAA): Terminal Area Forecast (TAF) and historical and forecast data for annual operations and enplanement at airport level, published annually.

[0388] Official Airline Guide (OAG)

[0389] Schedule information published monthly including origin, departure time, destination, arrival time, equipment, date of service.

[0390] Boeing

[0391] Current Market Outlook: Worldwide forecast of traffic and equipment demand by region, published annually.

[0392] Rolls-Royce

[0393] Market Outlook: Worldwide forecast of traffic, equipment and engine demand by region, published annually.

[0394] NASA

[0395] Aviation System Analysis Capability (ASAC): A complex system under development to forecast the capacity of air space and airports, traffic volume, equipment, carriers, environment and safety.

[0396] All of the above data sources are operations-oriented, not customer-focused, and most of them are aggregated data. However, the information may prove to be valuable, particularly in scheduling and market segmentation, to help define the defection and targeting population, and thus enhance the predictive power of the model.

[0397] Private Data Sources

[0398] Other data on airlines and on related issues are available from private vendors. This data usually needs to be purchased, and there are restrictions on use and distribution of the data.

[0399] Data may be available from the following private vendors:

[0400] Dun & Bradstreet;

[0401] Acxiom;

[0402] Experian;

[0403] Credit Bureau Data Sources; and

[0404] American Express.

[0405] The data may include the following information:

[0406] Individual Personal Identification Number (PIN);

[0407] Household PIN;

[0408] General Household Information, including:

[0409] Date of Birth;

[0410] Home Owner;

[0411] Address;

[0412] Length of Residence;

[0413] Dwelling Unit Size;

[0414] Geo Code;

[0415] Census Data;

[0416] All additional household members with name, gender and relationship; and

[0417] Number of children/age range.

[0418] Economic Data, including:

[0419] Educational Data;

[0420] Individual/Household Income (actual or estimated);

[0421] Geographic income percentile;

[0422] Occupation Category;

[0423] Employer (current, past);

[0424] Industry Mail Presence Indicator;

[0425] Home Business Indicator;

[0426] Business Owner Indicator; and

[0427] Direct Mail response.

[0428] Travel Related Data, including:

[0429] Frequent Flyer in Household;

[0430] Travel, Domestic;

[0431] Travel, International;

[0432] Vacation Home/Time Sharing;

[0433] Credit Cards/Debit Cards: Card Name, Card Type, Card Category;

[0434] Rental Car data; and

[0435] Hotel Data.

[0436] Lifestyle Data, including:

[0437] Neighborhood Lifestyle Cluster;

[0438] Household Lifestyle Cluster;

[0439] Vendor Specific Data; and

[0440] Targeting Code.

[0441] Data Extraction

[0442] The data extraction tasks are now described. Data extraction consists of mapping the data, extracting the data from all data sources, cleaning and conditioning the data, and creating the analytical data file. This section does not describe the procedures to extract data from various sources to a data warehouse, as there are many known methods in the art. The goal of extracting data is to build an analytical data file used to perform data analysis and build retention models. Therefore, successful completion of the data extraction process is a prerequisite to conducting data analysis and analytical modeling. The data extraction process is separate and distinct from the analysis and modeling process.

[0443] Mapping the Data Sources

[0444] It is common for computer systems that process and store various data sources to be incompatible. If a data warehouse is in place, the data warehouse will facilitate access to data that have been transformed and migrated. If there is no data warehouse, then it is necessary to bring data from different sources to the same format by using data transport tools to transform the data, as is known in the art.

[0445] All data sources need to be mapped and linked. If there is no data warehouse, the data is mapped with the help of airline personnel. For performing these steps, identifying a unique “Key” field is fundamental. For example, each customer may have an assigned account ID and each agent may also have an assigned Agent ID. The data should be mapped and linked according to those IDs, across the following data sources:

[0446] All internal operational data sources from different legacy systems should be linked and mapped so that each customer has a unique record, which includes all necessary fields;

[0447] Travel Agency data should be linked to customer data; and

[0448] If there is external data, the external data should be linked back to internal data.
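The linking described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the field names customer_id and agent_id, the three in-memory sources, and the sample records are hypothetical stand-ins for the carrier's actual systems. The sketch also reports the records that fail to link and the external match rate, since a low matching rate is one of the data conditions noted earlier.

```python
def link_sources(customers, agency_bookings, external_profiles):
    """Build one record per customer ID from three hypothetical sources."""
    # One unique record per customer, keyed on the customer ID.
    linked = {
        cust["customer_id"]: {**cust, "agency_ids": [], "profile": None}
        for cust in customers
    }

    # Link travel agency data to customers; keep what cannot be matched.
    unmatched = []
    for booking in agency_bookings:
        rec = linked.get(booking["customer_id"])
        if rec is None:
            unmatched.append(booking)
        elif booking["agent_id"] not in rec["agency_ids"]:
            rec["agency_ids"].append(booking["agent_id"])

    # Link external data back to the internal customer records.
    for profile in external_profiles:
        rec = linked.get(profile["customer_id"])
        if rec is not None:
            rec["profile"] = profile

    matched = sum(1 for rec in linked.values() if rec["profile"] is not None)
    match_rate = matched / len(linked) if linked else 0.0
    return linked, unmatched, match_rate


# Hypothetical sample records.
customers = [{"customer_id": "FF001", "name": "A"},
             {"customer_id": "FF002", "name": "B"}]
bookings = [{"customer_id": "FF001", "agent_id": "AG7"},
            {"customer_id": "FF001", "agent_id": "AG7"},
            {"customer_id": "FF999", "agent_id": "AG2"}]
profiles = [{"customer_id": "FF002", "occupation": "engineer"}]

linked, unmatched, match_rate = link_sources(customers, bookings, profiles)
```

In this sample, one booking refers to an unknown customer ID and is returned as unmatched, and one of the two customers has an external profile.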

[0449] Extract Data

[0450] When all necessary data linkages are established, a software tool (such as SAS) can be used to extract data from all the data sources and generate a database including all the data fields and records. The following data extraction methods can be used:

[0451] Most statistical software packages can handle data in an ASCII flat file format;

[0452] Some software packages, such as SAS, have facilities to directly transfer PC based files, such as .dif or .db files, to their own data file format;

[0453] Some software packages have the facilities to directly link and interface with database servers or database systems; and

[0454] If a data warehouse, such as Teradata, is in place, analysts can extract needed data elements from the data warehouse.

[0455] No matter which method or utility is used to extract the data, an important caveat is to note the data size. It is assumed that there is a large amount of data, including hundreds and thousands of records and hundreds of fields. Some facilities may have size limits or require that the appropriate size limits be defined to handle the data properly.
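One way to respect size limits when reading an ASCII flat file is to process it one record at a time rather than loading it whole. The following is a minimal Python sketch under stated assumptions: the comma-delimited layout and the column names customer_id and coupon_revenue are hypothetical, and an in-memory string stands in for the extract file.

```python
import csv
import io


def summarize_flat_file(handle, key_field, value_field):
    """Stream a delimited flat file and total value_field per key_field."""
    totals = {}
    # DictReader yields one row at a time, so memory use depends on the
    # number of distinct keys, not on the number of records in the file.
    for row in csv.DictReader(handle):
        key = row[key_field]
        totals[key] = totals.get(key, 0.0) + float(row[value_field])
    return totals


# Hypothetical extract; in practice this would be an opened file.
sample = io.StringIO(
    "customer_id,coupon_revenue\n"
    "FF001,250.00\n"
    "FF002,90.50\n"
    "FF001,100.00\n"
)
totals = summarize_flat_file(sample, "customer_id", "coupon_revenue")
```

The same function accepts any open text file, so an extract of any length can be passed in place of the sample.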

[0456] Data Cleansing and Conditioning

[0457] When all necessary internal and external data sources are identified and extracted, data needs to be cleaned and conditioned because data is rarely in a format or condition suitable for analysis purposes. The following are data cleansing and conditioning considerations:

[0458] Summarization: Transactional data contains very detailed information that is not useful to analysts in raw form, so analysts decide the correct level of data detail. Analysts usually need to “roll up” data. For example, customers' revenue contribution is summarized on a monthly basis even though the data is stored on a per-flight basis. Similarly, where detailed up-to-the-minute records exist, the time band of the customer's flights needs to be summarized to represent the customer's flying pattern.
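The roll-up described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the tuple layout of the flight records, the sample values, and the four time bands with their hour boundaries are hypothetical choices for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_revenue(flights):
    """Roll per-flight revenue up to (customer_id, 'YYYY-MM') totals."""
    totals = defaultdict(float)
    for customer_id, flight_date, revenue in flights:
        totals[(customer_id, flight_date.strftime("%Y-%m"))] += revenue
    return dict(totals)


def time_band(hour):
    """Collapse a departure hour (0-23) into a coarse time band.

    The band boundaries are assumed for illustration only.
    """
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 22:
        return "evening"
    return "night"


# Hypothetical per-flight records: (customer ID, flight date, revenue).
flights = [
    ("FF001", date(1999, 3, 2), 320.0),
    ("FF001", date(1999, 3, 28), 180.0),
    ("FF001", date(1999, 4, 5), 210.0),
]
summary = monthly_revenue(flights)
```

The two March flights collapse into a single monthly figure, while the April flight remains its own entry.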

[0459] Inconsistent Data Encoding—When information is gathered fromvarious sources, the same data may be represented differently. Someexamples include:

[0460] A customer ID in one data source is a ten-digit number, but in another data source it is a character field;

[0461] A revenue amount may be recorded in dollars or in hundred-dollar units, depending on the source of the data;

[0462] Ratios may be represented in several different ways, for example, fifty-five point four percent can be displayed as 55.4, 0.554, or 55.4%;

[0463] A negative number, such as negative ten, can be displayed as −10, (10), or 10 (in red color);

[0464] All date fields (such as MM/DD/YY) need to be transformed, formatted, or coded according to the rules of the analytical software; and

[0465] Multiple abbreviations are another problem. State, city, street address, and customer name may be coded differently, e.g., California may appear as “CA,” “Cal.,” or “Calif.”
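
The encoding differences listed above can be handled with small normalization routines. The following Python sketch is illustrative only: the function names, the target formats, and the alias table are assumptions, and each rule would need to be confirmed against the actual data sources.

```python
import re
from datetime import datetime

# Hypothetical lookup; a real engagement would need a complete, reviewed table.
STATE_ALIASES = {"ca": "CA", "cal.": "CA", "calif.": "CA", "california": "CA"}

def normalize_customer_id(value):
    """Represent every customer ID as a ten-character, zero-padded string."""
    return str(value).strip().zfill(10)

def normalize_ratio(text):
    """Convert '55.4%', '55.4', or '0.554' to a fraction between 0 and 1.

    Assumption: a bare number greater than 1 is a percentage. Small values
    remain ambiguous, so the convention of each source should be confirmed.
    """
    text = text.strip()
    if text.endswith("%"):
        return float(text[:-1]) / 100.0
    number = float(text)
    return number / 100.0 if number > 1 else number

def normalize_amount(text):
    """Parse '-10', a Unicode-minus '10', or accounting-style '(10)' as a signed number."""
    text = text.strip().replace("\u2212", "-").replace(",", "")
    match = re.fullmatch(r"\((.+)\)", text)
    if match:
        return -float(match.group(1))
    return float(text)

def normalize_date(text):
    """Convert MM/DD/YY to ISO format (Python maps two-digit years 69-99 to 19xx)."""
    return datetime.strptime(text.strip(), "%m/%d/%y").date().isoformat()

def normalize_state(text):
    """Map known spellings of a state to one code; unknown values pass through."""
    cleaned = text.strip()
    return STATE_ALIASES.get(cleaned.lower(), cleaned)
```

A negative value shown only by red color cannot be recovered from extracted text, so the sign for such a source has to be obtained from the source system itself.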

[0466] Textual Data—In many cases, text fields contain information that is irrelevant to data analysis. If the data is relevant, it is better to re-code the data into different, easier to use data formats. It is extremely important to be careful in recognizing commas, spaces, tabs, and letter case in order to code the data correctly.

[0467] Time Component of Data—The data obtained from operational systems usually contains time series components, which are very important information. It is very important to make the time components reflect the time sequential nature of the data, particularly for some data classification procedures (such as CHAID). Poor representation of time sequential data prevents the procedures from finding patterns related to time series.

[0468] As an example, if the data contains the frequency values for the past six months, coding the data as “FV01,” “FV02,” and so on lets the procedure recognize that FV01 precedes FV02 and FV02 precedes FV03. However, if the data also has a time series of mileage, coded as “ML01,” “ML02,” and so on, the procedures may not be able to find that FV01 and ML01 actually occurred in the same month. Failing to recognize the time sequential nature of data causes important information to be lost. A continuous decline in FV or mileage over the past six months may indicate that the customer's need for air travel has changed, or that the customer has changed, or is likely to change, carriers. If the data cannot capture this information, the model fails to predict this trend.

[0469] Another approach is to derive variables capturing the “changes” over time, if no time series components have been established.
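
One way to derive such “change” variables is sketched below in Python. The sketch assumes that the two-digit suffix of FV01 through FV06 and ML01 through ML06 is a month index with 01 the earliest month; the derived variable names and the use of a least-squares slope are illustrative choices, not part of the described method.

```python
def ordered_series(record, prefix, months=6):
    """Return values such as FV01..FV06 in chronological order.

    Assumption: the two-digit suffix is the month index, with 01 the earliest.
    """
    return [record[f"{prefix}{m:02d}"] for m in range(1, months + 1)]

def slope(values):
    """Least-squares slope of the values against their month index 1..n."""
    n = len(values)
    mean_x = (n + 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values, start=1))
    denominator = sum((x - mean_x) ** 2 for x in range(1, n + 1))
    return numerator / denominator

def change_variables(record):
    """Derive trend variables from the monthly frequency (FV) and mileage (ML) series."""
    fv = ordered_series(record, "FV")
    ml = ordered_series(record, "ML")
    return {
        "FV_SLOPE": slope(fv),
        "ML_SLOPE": slope(ml),
        # True only when every month is strictly lower than the month before it.
        "FV_DECLINING": all(later < earlier for earlier, later in zip(fv, fv[1:])),
        # Pairing FVnn with MLnn makes the same-month relationship explicit.
        "ML_PER_FLIGHT": [m / f if f else None for f, m in zip(fv, ml)],
    }

# Hypothetical customer whose flying declines steadily over six months.
customer = {
    "FV01": 6, "FV02": 5, "FV03": 4, "FV04": 3, "FV05": 2, "FV06": 1,
    "ML01": 6000, "ML02": 5000, "ML03": 4000, "ML04": 3000, "ML05": 2000, "ML06": 1000,
}
derived = change_variables(customer)
```

A negative slope or a true declining flag gives a classification procedure a single field that carries the trend, instead of relying on the procedure to infer the ordering of the monthly fields.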

[0470] Blanks, Missing Values, and Anomalies—Blanks and missing values are another common yet important problem. Blanks and missing values are coded differently on legacy systems. If a data warehouse is in place, the data warehouse's data loading script may code blanks and missing values based on internal rules. It is important to be careful in recognizing and coding these blanks or missing values. The following are examples.

[0471] If a customer's number-of-contacts field is blank, certain analytical software may treat this as “missing”. However, this blank field is not missing. It indicates that the customer has not been contacted by the airline. In this case, the analyst should code this blank field as “0” instead of keeping it as a blank.

[0472] In other cases, avoid using “0” (zero) when filling in blanks or missing values. Zero, in many systems, has a specific meaning. As an example, an external data vendor providing commercial credit score class data uses “0” as an indication of “Out of Business” and blank as an indication of “not available.” In this case, if the blanks are treated as “missing,” then not only will the data size be significantly reduced, but valid information is lost. In dealing with this problem, the data needs to be transformed and a new variable needs to be derived.

[0473] A missing value may be coded as a blank, “.”, “_”, “N/A”, “NULL”, or “99999999”. All these values need to be clarified and re-coded.

[0474] Several methods are used to fill in blank or missing fields. However, analysts should choose the method carefully for each field. One way to fill a missing or blank field is to use the average value calculated from that field, but some missing fields cannot be filled with average, minimum, or maximum values. Again, for customer contact, a missing field may simply represent no contact, and this field should not be filled with an average or other value.
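
The field-by-field treatment of blanks and missing values can be sketched in Python as follows. The token list comes from the examples above; the function names, the derived status variable, and its labels are hypothetical, and treating “99999999” as missing for every field is an assumption that must be checked per field.

```python
# Tokens that stand for "missing" in the examples above (assumed to apply to all fields).
MISSING_TOKENS = {"", ".", "_", "N/A", "NULL", "99999999"}

def is_missing(value):
    """Return True for None or for any of the known missing-value tokens."""
    return value is None or str(value).strip() in MISSING_TOKENS

def recode_contacts(value):
    """A blank contact count means the customer was never contacted, so code it as 0."""
    return 0 if is_missing(value) else int(value)

def recode_credit_class(value):
    """Split the vendor field into a class value plus a derived status variable.

    Follows the example in the text: '0' means out of business and blank
    means not available, so neither is collapsed into the other.
    """
    if is_missing(value):
        return {"credit_class": None, "credit_status": "NOT_AVAILABLE"}
    if str(value).strip() == "0":
        return {"credit_class": None, "credit_status": "OUT_OF_BUSINESS"}
    return {"credit_class": int(value), "credit_status": "SCORED"}

def fill_with_mean(values):
    """Mean imputation, suitable only for fields where an average is meaningful."""
    present = [float(v) for v in values if not is_missing(v)]
    if not present:
        return [None for _ in values]
    mean = sum(present) / len(present)
    return [mean if is_missing(v) else float(v) for v in values]
```

For example, `fill_with_mean(["10", "N/A", "20"])` fills the middle entry with the average of the other two, whereas the contact field is never passed to this function because its blanks carry meaning.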

[0475] There may be some anomalies. Negative coupon revenue may be an anomaly, particularly if a customer account constantly shows negative coupon revenue values over the investigated time. There may be a negative coupon value for a reward flight, but not for the entire period. When this kind of problem is encountered, the airline personnel need to explain why it occurs and how to transform or re-code this field. For the sake of data integrity, if no explanation or remedy is found, this kind of data record should be eliminated from the modeling process.
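
A simple screen for this anomaly is sketched below in Python. The three status labels and the `explained` set of accounts cleared by airline personnel are hypothetical names introduced for illustration.

```python
def classify_coupon_revenue(monthly_revenue):
    """Classify an account by the sign of its coupon revenue over the study period."""
    negatives = [r for r in monthly_revenue if r < 0]
    if not negatives:
        return "ok"
    if len(negatives) == len(monthly_revenue):
        # Negative in every period: an anomaly unless the airline can explain it.
        return "exclude_unless_explained"
    # Occasional negatives may be legitimate, for example a reward flight.
    return "check_reward_flight"

def filter_accounts(accounts, explained=frozenset()):
    """Drop accounts that are negative in every period and have no explanation."""
    kept = {}
    for customer_id, revenue in accounts.items():
        status = classify_coupon_revenue(revenue)
        if status == "exclude_unless_explained" and customer_id not in explained:
            continue
        kept[customer_id] = revenue
    return kept

# Hypothetical monthly coupon revenue for three accounts.
accounts = {
    "A": [120.0, 80.0, 95.0],
    "B": [150.0, -40.0, 60.0],
    "C": [-10.0, -25.0, -5.0],
}
kept = filter_accounts(accounts)
```

Account C is removed unless it appears in `explained`, while account B is retained and flagged for a check against reward flight records.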

[0476] If a client airline has installed a data warehouse, most of the data problems are resolved through data transformation. However, some coding problems still need to be resolved, such as how to code missing values or blanks. If there is no data warehouse in place, then the data needs to be cleaned and conditioned in order to generate a suitable database.

[0477] Analytical Data File

[0478] Once the data is sufficiently clean and complete, an analytical data file is generated. If SAS is the tool used, the SAS data steps and procedures are followed to load the data into a SAS data set. This analytical data file is used for further data analysis and for the modeling process. The analytical data file should satisfy the following criteria:

[0479] Internal operational data, such as flight, O&D, mileage, and revenue, are appropriately summarized;

[0480] Each record has a unique customer ID number;

[0481] No duplicate records;

[0482] If external data is available, external data records match one-to-one with the corresponding internal data records; and

[0483] Records in the analytical data file consist of the population being investigated.
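
Several of the criteria above can be checked mechanically before modeling begins. The following Python sketch assumes each record is a dictionary with a `customer_id` key, which is a hypothetical layout; it does not check whether the operational data was summarized appropriately, which remains an analyst judgment.

```python
from collections import Counter

def validate_analytical_file(internal, external=None, population=None):
    """Return a list of problems; an empty list means the checked criteria hold."""
    problems = []
    ids = [row.get("customer_id") for row in internal]
    if any(not cid for cid in ids):
        problems.append("record without a customer ID")

    # Each customer ID must appear exactly once (unique IDs, no duplicate records).
    counts = Counter(cid for cid in ids if cid)
    duplicates = sorted(cid for cid, n in counts.items() if n > 1)
    if duplicates:
        problems.append(f"duplicate customer IDs: {duplicates}")

    # Every internal customer must have exactly one external record.
    if external is not None:
        ext_counts = Counter(row.get("customer_id") for row in external)
        unmatched = sorted(cid for cid in counts if ext_counts.get(cid, 0) != 1)
        if unmatched:
            problems.append(f"no one-to-one external match for: {unmatched}")

    # Records must belong to the population being investigated.
    if population is not None:
        outside = sorted(set(counts) - set(population))
        if outside:
            problems.append(f"records outside the study population: {outside}")
    return problems

# Hypothetical files: customer "2" is duplicated and has no external record.
internal = [{"customer_id": "1"}, {"customer_id": "2"}, {"customer_id": "2"}]
external = [{"customer_id": "1"}]
report = validate_analytical_file(internal, external, population={"1", "2"})
```

An empty report indicates that the file passed the checks that were run; any entry in the report should be resolved before the file is used for modeling.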

[0484] It will be readily seen by one of ordinary skill in the art that the present invention fulfills all of the objects set forth above. After reading the foregoing specification, one of ordinary skill will be able to effect various changes, substitutions of equivalents and various other aspects of the invention as broadly disclosed herein. It is therefore intended that the protection granted hereon be limited only by the definition contained in the appended claims and equivalents thereof.

What is claimed is:
 1. A method of building a customer retention model comprising the following steps: identifying data elements; identifying data sources; laying out a data file format; identifying statistical and analytical packages; and applying statistical and analytical packages to data from data sources fulfilling data elements identified in the data file format to perform customer retention.
 2. The method as claimed in claim 1, wherein the data elements include: frequent flyer program membership information; passenger flying data; booking channel data; ticketing data; and costs.
 3. The method as claimed in claim 1, wherein the data sources include at least one of an internal data source and an external data source.
 4. The method as claimed in claim 3, wherein the internal data source includes: customer data; revenue management data; flight scheduling data; sales channel data; and travel agency data.
 5. The method as claimed in claim 3, wherein the external data source includes at least one of a public data source and a private data source.
 6. The method as claimed in claim 5, wherein the public data source includes Department of Transportation data, Federal Aviation Administration data, Official Airline Guide data, Boeing data, Rolls-Royce data, and NASA data.
 7. The method as claimed in claim 5, wherein the private data source includes Dun & Bradstreet data, Acxiom data, Experian data, Credit Bureau Data Sources, and American Express data.
 8. A method of building a customer retention model comprising the following steps: identifying data elements; identifying data sources; laying out a data file format; identifying statistical and analytical packages; and applying statistical and analytical packages to data from data sources fulfilling data elements identified in the data file format to identify customers for customer retention.
 9. The method as claimed in claim 8, wherein the data elements include: frequent flyer program membership information; passenger flying data; booking channel data; ticketing data; and costs.
 10. The method as claimed in claim 8, wherein the data sources include at least one of an internal data source and an external data source.
 11. The method as claimed in claim 9, wherein the internal data source includes: customer data; revenue management data; flight scheduling data; sales channel data; and travel agency data.
 12. The method as claimed in claim 9, wherein the external data source includes at least one of a public data source and a private data source.
 13. The method as claimed in claim 11, wherein the public data source includes Department of Transportation data, Federal Aviation Administration data, Official Airline Guide data, Boeing data, Rolls-Royce data, and NASA data.
 14. The method as claimed in claim 11, wherein the private data source includes Dun & Bradstreet data, Acxiom data, Experian data, Credit Bureau Data Sources, and American Express data.
 15. A method of identifying highly valued customers using a Customer Value Metric Model comprising the following steps: identifying customer value criteria; identifying customer data elements; identifying data sources of the data elements; applying a Customer Value Metric Model to data from the data sources in accordance with the customer value criteria to identify high value customers.
 16. A method of identifying highly valued customers using a Customer Value Metric Model comprising: determining a frequency value for each customer; determining a net revenue contribution value for each customer; scoring the frequency value and net revenue contribution value for each customer; and identifying the highly valued customers by ranking the customers based on the score.
 17. The method as claimed in claim 4, comprising: ranking the customers based on the frequency value score.
 18. The method as claimed in claim 4, comprising: ranking the customers based on the net revenue contribution value score.
 19. The method as claimed in claim 4, further comprising: sorting the scores based on score pairs including frequency value and net revenue contribution value.
 20. The method as claimed in claim 19, further comprising: sorting matching score pairs based on net revenue contribution value; dividing the customers into N groups; assigning a numerical value 1-N to each group; and ranking the customers based on the assigned numerical value to identify the highly valued customers.
 21. The method as claimed in claim 20, wherein N is 100.